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Cell-library  design  approaches  have  been  the  most  popular  VLSI  design 
methods  for  the  past  decade.  General  cell  layout,  which  uses  cells  with  different 
shapes  and  sizes,  is  the  cell-library  design  approach  widely  used  for  high- 
complexity,  high-density  VLSI  layout  design.  However,  one  major  drawback  of 
this  design  approach  is  that  the  shapes  and  sizes  of  the  prestored  cells  usually  do 
not  fit  well  into  the  allotted  chip  subareas.  Such  kind  of  cell  shape  mismatch 
causes  sizable  redundant  chip  area  and  requires  laborious  cell  redesign  work  to 
improve  it.  In  order  to  alleviate  the  problem,  an  automatic,  flexible  cell  layout 
generation  method  has  been  studied  and  developed.  The  developed  method  uses 
delicate  cell  structure  modeling  and  layout  techniques  to  produce  a custom-fit 
symbolic  leaf  cell  from  a transistor  connectivity  description  and  a description  of 
the  environment  in  which  the  cell  will  reside.  The  cell  generation  system  not  only 
produces  good  quality  cell  layouts  within  short  turn-around-time  but  also  reduces 
redundant  chip  area  with  custom-fit  cells.  Consequently,  the  function  of  the  gen- 
eral cell  layout  system  is  greatly  enhanced. 


CHAPTER  I 
INTRODUCTION 


With  the  advance  of  IC  technology,  the  chip  complexity  has  grown  to  the 
order  of  hundred  thousands  gates  level.  Consequently,  the  difficulty  of  integrated 
circuit  design  has  increased  exponentially.  Design  and  layout  for  IC  packages 
have  become  so  time-consuming  that  it  usually  requires  a few  years  and  several 
good  designers  to  develop  a state-of-the-art  custom  VLSI  chip.  In  order  to 
reduce  the  chip  turn-around- time  and  the  closely  related  chip  design  cost, 
numerous  systematic  design  methodologies  and  automatic  layout  techniques  have 
been  proposed. 

Polycell  layout  (see  Figure  1.1(a)),  or  called  standard  cell  layout,  was  the 
first  cell-based  automatic  layout  method  proposed  in  early  1970s.  It  has  been 
successfully  used  in  many  automated  layout  systems,  including  MP2D  [l],  LTN 
[2],  and  CALMOS  [3].  In  most  such  systems,  all  cells  are  prepared  to  have  the 
same  height  so  as  to  simplify  the  layout  structure  and  to  reduce  the 
layout  design  time.  Unfortunately,  this  kind  of  constraint  considerably  limits 
the  range  of  cell  complexity  and  creates  interconnection  problems  for  com- 
plicated cells  [4]. 

General  cell  layout  (see  Figure  1.1(b)),  or  called  building  block  layout, 
has  been  discussed  for  years  as  an  alternative  approach  for  cell-based  automatic 
layout.  In  such  systems,  cells  are  allowed  to  have  different  shapes  and  sizes  which 
alleviates  the  problems  created  by  polycell  layout  system.  By  using  a hierarch- 
ical approach  [5],  general  cell  layout  systems  can  handle  chip  design  with 
very  high  complexity.  In  addition,  general  cell  layout  systems  usually  produce 
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Figure  1.1  Sample  Chip  Floor  Plans,  a)  Polycell  Layout;  b)  General 
Cell  Layout 
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Figure  1,2  Comparison  of  the  Layout  Performance  of  Different 
Approaches , 
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chips  with  smaller  sizes  compared  with  polycell  layout  systems.  The  superior- 
ity in  chip  size  for  general  cell  layout  system  is  demonstrated  in  Figure  1.2. 
Although  the  chip  size  is  still  much  larger  compared  with  the  full-custom  layout 
of  random-logic  networks,  it  is  about  50%  smaller  than  the  size  of  the  chips  gen- 
erated by  polycell  layout  systems  [6]. 

Although  the  general  cell  layout  method  is  obviously  superior  in  reducing 
chip  area,  some  major  drawbacks  have  severely  limited  its  function  in  practical 
use.  In  most  of  the  current  general  cell  layout  systems,  only  the  block-level  lay- 
out (chip  floor  plan)  is  automated.  Functional  blocks  (cells)  are  manually 
designed  and  prestored  in  the  cell  library  for  the  use  of  chip  floor  planning  pro- 
cess. Since  the  shapes  of  all  the  cells  are  fixed  beforehand,  sizable  redundant  chip 
area  is  caused  by  shape  mismatch  between  neighboring  cells.  Figure  1.3(a)  is  an 
example  which  shows  that  the  shape  mismatch  between  cell  A and  cell  B causes 
redundant  chip  area  which  is  marked  by  hatched  lines.  To  reduce  such  redun- 
dant area,  researchers  have  proposed  placement  improvement  techniques 
such  as  rotation,  squeezing,  reflecting  [7],  and  block  packing  [8],  However,  the 
block-level  placement  improvement  techniques  are  constrained  by  the  initial 
floor  plan  structure  and  the  fixed  cell  configurations,  and  can  only  be  applied 
to  limited  cases.  Figures  1.3(b)  to  1.3(d)  show  that  the  redundant  chip  area  can- 
not be  reduced  by  any  of  the  block-level  techniques  in  Lauther  [8],  In  order 
to  resolve  the  problem,  configurations  of  the  cells  have  to  be  made  flexible.  Fig- 
ure 1.3(e)  shows  that  the  redundant  chip  area  in  Figure  1.3(a)  is  eliminated  by 
reshaping  cells  A to  a different  configuration.  In  the  past,  such  cell  redesign  work 
was  done  by  experienced  designers.  This  work  was  very  time  consuming  because 
multiple  iterations  were  usually  required  for  a good  chip  layout  design.  In 
terms  of  this  need,  an  automatic,  flexible  cell  layout  generation  system  is 
considered  as  indispensable  in  enhancing  the  function  of  the  current  general  cell 
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Figure  1.3  Improve  the  Redundant  Chip  Area  Caused  by  Cell  Shape  Mismatch, 
a)  Original  Chip  Floor  Plan;  b)  Rotate  Cell  A and  B; 
c)  Rotate  Cell  A;  d)  Rotate  Cell  B;  e)  Reconfigure  Cell  A. 
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layout  systems.  The  work  presented  in  this  dissertation  pursues  this  research 
direction. 

There  are  two  main  objectives  sought  in  this  dissertation.  The  first 
objective  is  to  study  the  interactive  behavior  between  the  chip  floor  plan  and  cell 
design.  The  problem  of  cell  design  is  considered  under  the  global  chip  design 
environment.  The  second  objective  is  to  transcend  the  automatic  cell  layout 
schemes  of  the  past  and  develop  new  layout  techniques  that  lead  to  improved 
layouts  of  the  cells  which  meet  the  requirements  of  the  chip  floor  plan. 

The  design  of  the  cell  layout  generation  system  is  based  upon  one  of  the  sym- 
bolic layout  methods  [9-14]  --  Gate  Matrix  Layout  [9].  Layout  generation  is  exe- 
cuted on  a virtual  grid  structure  with  equally  spaced  grids.  Topology  of  the 
layout  primitives,  i.e.  devices  and  wires,  is  determined  by  graph-theoretical  and 
heuristic  algorithms  and  is  recorded  as  a set  of  symbols  located  on  the  grid 
structure.  The  symbolic  layout  is  then  compacted  by  cell  compactors  [15-19] 
to  generate  physical  layout.  Because  most  of  the  enumeration  and  computation 
are  performed  at  discrete  and  symbolic  levels,  the  cell  design  time  is  greatly 
reduced.  Moreover,  the  symbolic  layout  approach  isolates  the  design  rules  from 
the  layout  algorithms  and  makes  the  layout  technology  independent.  The  cost 
of  designing  and  maintaining  the  cell  library  is  also  highly  reduced. 

Chapter  II  presents  a survey  on  several  recent  automatic  symbolic  layout 
generation  algorithms.  Performance  and  constraints  for  those  algorithms  are 
comparatively  discussed. 

Chapter  III  gives  a complete  overview  of  the  automatic  cell  layout  genera- 
tion problem.  The  problem  is  considered  under  the  whole  VLSI  layout  environ- 
ment. A flexible  grid  structure  is  proposed  for  cell  layout  such  as  to  simplify  the 
design  and  to  increase  the  flexibility.  Layout  problems  are  then  formulated  based 
on  the  grid  structure. 
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Chapter  IV  carries  two  major  geometric  analyses  to  determine  the  cell 
configurations.  The  chip  floor  plan  analysis  is  used  to  reassign  the  cells  with  new 
chip  subareas  such  that  the  shape  mismatch  between  neighboring  cells  is  minim- 
ized. The  purpose  of  the  cell  boundary  analysis  is  to  choose  a grid  structure  for 
the  cell  layout  process  such  that  the  shape  mismatch  between  the  cell  and  its 
apportioned  chip  subarea  is  minimized. 

Chapter  V illustrates  the  device  placement  algorithms  which  determine 
the  locations  of  the  devices  on  the  grid  structure.  Three  grid  locations  of 
the  devices  are  to  be  determined  in  this  process.  Task  I uses  a linear  order 
algorithm  to  resolve  the  order  of  device  columns  and  determines  the  hor- 
izontal grid  locations  of  the  devices  at  the  same  time.  Task  II  employs  several 
circuit  heuristics  to  determine  the  relative  positions  of  the  source  and  drain  ter- 
minals on  each  device.  Task  III  utilizes  a vertical  constraint  graph  concept  to 
determine  the  vertical  grid  locations  of  the  devices. 

In  Chapter  VI,  a device  wiring  algorithm  for  cell  layout  is  described.  The 
objective  of  this  process  is  to  realize  the  circuit  connectivity  with  a minimum 
number  of  wire  segments  and  contacts.  The  process  is  also  decomposed  into 
three  tasks.  Terminal  labeling  is  a task  which  labels  the  net  number  of  each 
device  terminal  on  its  grid  location.  This  task  provides  a data  structure  for  the 
use  of  the  other  two  routing  tasks.  All  the  grid  points  with  the  same  net 
number  constitute  a multi-point  net  which  should  be  connected  by  wire  seg- 
ments. The  task  of  wire-list  generation  is  then  used  to  find  a shortest  path 
which  connects  each  terminal  of  the  multi-point  net.  The  final  task  of  device 
wiring,  which  is  called  detail  routing,  is  then  implemented  to  realize  the  gen- 
erated shortest  path  with  minimum  number  of  wire  segments  and  contacts. 

Chapter  VII  summarizes  the  major  accomplishments  reported  in  this 
dissertation  and  provides  suggestions  for  further  research. 


CHAPTER  II 

PREVIOUS  RESEARCH  ON  AUTOMATIC  CELL  LAYOUT  ALGORITHMS 


During  the  past  two  decades,  considerable  efforts  have  been  devoted  toward 
replacing  manual  IC  layout  design  work  by  computers.  Consequently,  many 
automatic  layout  algorithms  have  been  reported.  However,  only  a few  algo- 
rithms [20-34]  have  been  developed  for  automatic  cell  layout  design.  One  of 
the  major  reasons  for  the  lack  of  success  in  this  area  is  that  human  visual 
inspection  is  usually  better  than  CAD  software  in  designing  networks  that  match 
the  shapes  and  sizes  of  usable  subareas  of  the  chip.  Automatic  generation  of 
flexible  cells  for  the  use  of  general  cell  VLSI  layout  systems  is  a problem  yet  to 
be  resolved. 

A survey  on  the  various  research  and  development  studies  on  automatic  cell 
layout  generation  is  given  below.  The  success  of  an  automatic  cell  layout 
algorithm  relies  on  whether  it  can  exploit  the  design  search  space  with 
underlying  modularity  and  geometric  simplicity.  On  the  contrary,  the 
underlying  layout  structure  should  be  flexible  enough  to  allow  the  generated 
cells  to  have  different  configurations.  The  discussion  of  the  layout  algorithms 
will  be  focused  on  these  two  main  themes:  geometric  regularity  and  geometric 
flexibility  of  the  layout  environment. 

2J Automatic  Cell  Layout  Algorithms  Based  on 

Weinberger’s  Layout,  Structure 

Weinberger’s  layout  array  [20]  was  the  first  structured  design  methodology 
proposed  for  cell  layout.  In  his  method,  each  circuit  was  implemented  as  a 
one-dimensional  array  of  NAND/NOR  gates.  Each  NAND/NOR  gate  has  an 


8 


output  and  one  or  several  inputs  and  is  realized  as  an  enlongated  vertical 
diffusion  area  with  horizontal  cross-over  input  signals.  Input  signals  are  con- 
nected with  the  NAND/NOR  gates  through  the  formation  of  input  devices, 
which  are  formed  between  the  NAND/NOR  diffusion  strip  and  a parallel 
ground  diffusion  strip.  The  connections  of  the  output  signals  are  formed  by 
placing  contacts  between  signals  and  NAND/NOR  diffusion  strips. 

Ohtsuki  et  al.  [21]  were  the  earliest  researchers  to  develop  automatic  lay- 
out algorithms  based  on  Weinberger’s  layout  structure.  Their  method  modeled 
each  circuit  (see  Figure  2.1)  as  a netlist  structure  (see  Figure  2.2),  where  each 
column  represents  a NAND/NOR  gate  and  the  horizontal  wires  represent  the 
between-gate  connections.  An  interval  graph  is  then  formulated  as  shown  in 
Figure  2.3.  Each  vertex  of  the  interval  graph  indicates  a net  segment  in  Figure 
2.2.  An  edge  between  two  vertices  indicates  that  the  net  segments 
represented  by  those  two  vertices  are  overlapped.  Their  problem  is  then  for- 
mulated as  finding  a permutation  of  the  gates  of  which  the  interval  graph  has 
the  minimum  clique  number  (the  number  of  the  vertices  of  a complete  sub- 
graph). Since  the  clique  of  the  interval  graph  indicates  the  number  of  net  seg- 
ments which  are  overlapped  and  cannot  be  placed  at  the  same  track,  a 
minimum  number  of  cliques  indicate  a minimum  number  of  horizontal  tracks 
required  for  the  placement  of  net  segments.  Consequently,  the  cell  area  is  minim- 
ized. However,  there  are  several  drawbacks  associated  with  this  study.  One  is 
that  the  search  for  the  minimum  clique  of  any  graph  is  a classical  NP-complete 
problem  [35].  Consequently,  they  had  to  rely  on  heuristic  algorithms  to  resolve 
the  problem.  A near-optimum  instead  of  an  absolute-optimum  result  was 
derived.  Another  problem  is  that  sizable  redundant  cell  area  (white  space)  is 
usually  created  by  using  gates  instead  of  transistors  as  layout  primitives.  The 
generated  cells  are  generally  much  larger  than  the  manually  designed  cells.  The 
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Figure  2.1  A Sample  NAND  Gate  Logic  Circuit. 
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Figure  2.2  Netlist  Representation  of  the  Circuit  in  Figure  2.1. 
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Figure  2.3  Interval  Graph  Representation  Corresponding  to 
the  Placement  in  Figure  2.2. 


10 


other  problem  for  this  approach  is  that  there  is  no  topological  flexibility 
for  the  generated  cells.  For  a given  circuit,  the  generated  cell  layout  will  have 
only  one  kind  of  configuration  because  a single-row  layout  structure  is  assumed. 
The  generated  cells,  therefore,  have  very  limited  use  in  general  cell  layout 
systems  or  full-custom  design. 

Another  work  which  used  Weinberger’s  layout  structure  was  developed  by 
Uehara  and  vanCleemput  [22].  In  order  to  reduce  white  space  within  the  cells, 
transistors  were  used  as  layout  primitives  in  their  algorithm.  They  also  used 
CMOS  technology  to  replace  the  original  MM  OS  technology.  In  their  approach, 
circuits  were  assumed  to  be  constructed  by  the  complementary  switches  cir- 
cuit design  method  [36].  Each  circuit  is  realized  as  series  and  parallel  combi- 
nations of  transistors  where  the  p-structure  was  always  the  logical  dual  of  the  n- 
structure.  Consequently,  an  CMOS  transistor  circuit  can  be  converted  to  a spe- 
cial graph  where  1)  the  vertices  in  the  graph  are  the  source/drain  connections, 
and  (2)  the  edges  in  the  graph  are  transistors  that  connect  particular  source-drain 
vertices.  Two  graphs  result,  one  for  the  n-logic  tree  and  one  for  the  p-logic  tree. 
Figure  2.4  shows  an  example  of  the  graph  transformation.  The  connection  of 
edges  in  the  graphs  mirrors  the  series-parallel  connections  in  the  circuit.  If  two 
edges  are  adjacent  in  the  p-  or  n-graph,  then  they  may  share  a common 
source-drain  connection  and  may  be  connected  by  abutment.  Furthermore, 
if  there  exists  a sequence  of  edges  in  the  p-graph  and  n-graph  that  have 
identical  labeling,  then  the  gate  may  be  designed  with  no  breaks.  This  path  is 
known  as  an  Euler  path  [35],  which  is  defined  as  a closed  walk  running  through 
every  edge  exactly  once.  The  main  points  of  the  algorithm  are  as  follows: 

1.  Find  all  Euler  paths  that  cover  the  graph. 

2.  Find  a p-  and  n-  Euler  path  that  have  identical  labeling  (a  labeling  is 
an  ordering  of  the  gate  labels  on  each  vertex). 
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Figure  2.4  CMOS  Logic  Gate  Graph  Representation,  a)  A Sample 
Circuit;  b)  Graph  Representation  of  a). 
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Figure  2.5  Layout  Design  by  Euler  Path,  a)  Euler  Paths  of 
Figure  2.4;  b)  The  Corresponding  Layout. 
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3.  If  (2)  is  not  found,  then  break  the  gate  in  the  minimum  number  of 
places  to  achieve  (2)  by  separate  Euler  paths. 

In  this  approach,  cell  area  is  minimized  through  the  maximization  of  the 
number  of  transistor  abutments.  Figure  2.5  shows  an  example  of  cell  layout 
design  by  Euler  graph.  However,  there  are  two  drawbacks  for  this  method. 
First,  the  Euler  path  searching  process  is  also  an  NP-complete  problem  which 
is  difficult  to  be  implemented.  Second,  their  assumption  of  two-row  lay- 
out structure  also  limits  the  flexibility  of  cell  configurations. 

2.2  Algorithms  Rased  on  date  Matrix  Layout  Structure 

The  Gate  Matrix  Layout  method  [9]  is  an  evolution  from  Weinberger’s 
layout  methodology.  It  intended  to  provide  a layout  method  which  can  generate 
compact  CMOS  cells.  The  idea  of  this  method  is  to  use  transistors  and  wires  as 
the  layout  primitives.  An  array  of  vertical  polysilicon  lines  is  used  as  both  the 
sites  of  transistors  and  the  connecting  paths  of  the  transistors.  The  transistors 
are  placed  on  the  polysilicon  lines  and  are  interconnected  by  three  kinds  of 
paths:  vertical  polysilicon  lines,  vertical  diffusion  runs,  and  horizontal  metal 
wires.  Diffusion  columns  are  located  between  two  polysilicon  lines.  Horizontal 
metal  wires  are  placed  on  matrix  rows  together  with  the  devices 
(transistors).  The  devices  are  placed  horizontally  with  the  vertical  polysilicon 
lines  crossing  the  gate  terminal  and  diffusion  columns  crossing  the  drain  and 
source  terminals.  The  general  structure  of  Gate  Matrix  Layout  is  shown  in 
Figure  2.6. 

Only  a couple  of  researchers  have  proposed  automatic  layout  algorithms 
[23-26]  based  on  Gate  Matrix  Layout  method.  Both  of  them  used  the  same  inter- 
val graph  procedures  proposed  by  Ohtsuki  et  al.  They  formulated  an  CMOS 
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(a) 


vss 


VDD 


1 2 3 4 5 6 7 


(c) 


Figure  2.6  The  Gate  Matrix  Layout  Structure,  a)  Vertical  Grid 
Structure;  b)  Horizontal  Grid  Structure;  c) 
Realization  of  a Circuit  on  Gate  Matrix. 
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transistor  circuit  with  the  same  netlist  structure  as  shown  in  Figure  2-1.  Unfor- 
tunately, since  the  gate  matrix  structure  is  naturally  different  from  the 
Weinberger’s  array  structure,  the  same  netlist  structure  can  not  be  applied 
directly.  The  main  problem  with  their  approach  is  that  they  neglect  the  vert- 
ical wiring  during  the  problem  formulation.  For  a Weinberger’s  single-gate 
row  structure  or  two-device-row  structure  as  proposed  by  Uehara  et  al  [22],  no 
vertical  wiring  is  needed  because  the  vertical  wiring  is  realized  by  the  vertical 
gates’  or  devices’  area.  However,  since  there  are  more  than  one  device  rows  in 
Gate  Matrix  Layout  structure,  the  vertical  wiring  between  two  p-device  rows, 
one  p-device  row  and  one  n-device  row,  and  the  device  rows  and  the 

power/ground  nets  should  be  considered.  Especially  as  the  Gate  Matrix  is  a com- 
pact layout  structure,  the  vertical  wiring,  which  usually  causes  vertical 

channel  congestion  problem,  is  usually  the  most  difficult  problem  to  deal  with. 

2.3  Layout  Algorithms  Rased  on  Circuit  Embedding 

Circuit  embedding  is  a classical  problem  in  graph  theory.  Given  a cir- 
cuit, if  we  represent  each  component  as  a vertex  and  each  wire  as  an  edge, 
then  there  is  a corresponding  graph  for  the  circuit.  A drawing  of  a geometric 
representation  of  a graph  on  any  surface  such  that  no  edges  intersect  is  called 

embedding.  This  kind  of  approach  assumes  no  layout  structure  and  is 

regarded  as  a different  category  from  the  previous  two  layout  methodologies. 

The  circuit  embedding  method  was  first  applied  by  Rose  and  Oldfield  [27] 
for  the  embedding  in  single-layer  circuit  boards.  Engl  et  al.  [28]  used  a graph 
theoretical  approach  to  embed  bipolar  analog  circuits  into  integrated  circuits. 
Van  Lier  and  Otten  [29]  used  a similar  approach  to  solve  the  problem  of  embed- 
ding bipolar  analog  circuits  in  monolithic  technology.  Ng  and  Johnsson  [30] 
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used  the  same  approach  to  develop  automatic  layout  algorithms  for  MOS  cir- 
cuit. In  his  approach,  he  modeled  a circuit  as  a special  circuit  graph.  He  then  used 
a path  finding  algorithm,  developed  by  Hopcroft  and  Tarjan  [37],  to  find  the 
hierarchy  of  paths  which  record  the  information  of  the  whole  circuit.  A cou- 
ple of  constraint  graphs  are  then  derived  from  the  path  tree  which  records  the 
topological  constraints  of  the  paths.  The  circuit  embedding  topology  can  be 
generated  by  simultaneous  assignment  or  gradual  assignment  of  the  paths 
which  satisfy  the  topological  constraints. 

2.4  Discussion 

In  this  survey,  we  note  that  most  of  the  approaches  treated  cell  design 
as  a problem  which  was  separated  from  chip  design.  Each  algorithm 
emphasized  only  minimizing  the  individual  cell  area  and  offered  limited  study  of 
the  impact  of  the  cells  on  the  chip  design.  In  practical  VLSI  design,  however, 
cell  configuration  is  usually  more  important  than  cell  area  in  global  optimi- 
zation of  chip  area.  Optimization  of  the  cell  design  needs  to  consider  the  whole 
chip  design  environment. 

Another  problem  with  the  previous  approaches  is  that  most  of  the  algo- 
rithms used  a simple  graph  model  to  represent  the  layout  structure.  Since  the 
modeling  is  usually  oversimplified  and  sometimes  inaccurate,  the  implementa- 
tion result  can  be  far  different  from  the  theoretical  result.  Consequently,  even 
if  the  theoretical  result  proves  to  be  good,  the  physical  layout  may  be  poor. 

In  retrospect,  if  we  are  to  design  an  automatic  cell  generation  system,  it 
is  important  that  we  make  note  of  the  following  points: 

(l)  Cell  layout  generation  is  not  a stand-alone  problem.  Optimization  of 
the  cell  design  is  meaningless  unless  it  can  improve  the  performance 
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of  the  chip. 

(2)  The  constraints  of  the  layout  domain  have  to  be  relaxed  somewhat 
to  allow  certain  flexibility  in  cell  configurations. 

(3)  A delicate  modeling  technique  is  needed  to  accurately  transcend 
the  transformation  between  the  graph  model  and  symbolic  layout 
structure,  and  the  symbolic  layout  structure  and  physical  layout 
structure. 


C FLAP  TER  III 

CELL  LAYOUT  GENERATION  PROBLEM  OVERVIEW 


The  general  cell  layout  method,  as  mentioned  previously,  has  become  the 
most  popular  approach  for  efficient  VLSI  layout.  Previous  methods  applicable  to 
this  layout  approach  include  min-cut/slicing  [8,38],  constructive  [39],  analytic 
[40-42],  and  simulated  annealing  [43,44].  In  general,  the  general  cell  layout 
approach  can  fully  take  advantage  of  the  design  flexibility  in  cell 
configurations  and  is  expected  to  achieve  higher  chip  density  and  chip  perfor- 
mance than  the  polycell  layout  approach.  However,  in  most  of  the  current  sys- 
tems, the  chip  floor  planning  and  cell  generation  processes  are  considered 
separately.  The  lack  of  real-time  interaction  between  these  two  processes  has 
severely  degraded  the  performance  of  general  cell  layout  systems.  Sizable 
redundant  chip  areas  are  created  due  to  the  shape  mismatch  between  the  allotted 
chip  subarea  and  generated  cell  of  each  subcircuit.  Laborious  cell  redesign  work 
is  required  to  reduce  such  redundant  chip  areas. 

In  consideration  of  the  practical  design  environment,  an  integrated,  flexible- 
cell layout  system  has  been  developed.  An  automatic  cell  generation  system  is 
introduced  into  the  general  cell  VLSI  layout  environment  to  provide  real-time 
interaction  with  the  automatic  chip  floor  planner.  The  redundant  chip  area 
caused  by  cell  shape  mismatch  will  be  greatly  reduced  through  the  real  time, 
iterative  improvement. 

Four  sections  are  included  in  this  chapter.  In  the  first  section,  the  structure 
of  the  integrated,  flexible-cell  layout  system  is  described.  The  function  of  the 
proposed  automatic,  flexible  cell  generation  system  is  also  discussed.  In  the 
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second  section,  a symbolic  cell  layout  design  methodology,  Gate  Matrix  Lay- 
out method,  is  described.  A cell  structure  model,  which  links  the  symbolic  layout 
environment  with  the  physical  layout  environment,  is  also  presented.  In  the 
third  section,  the  symbolic  cell  layout  problems  are  formulated.  The  layout 
problem  is  divided  into  several  subproblems.  Finally,  the  last  section 
presents  the  system  structure  of  the  cell  layout  generation  process  which  solves 
the  formulated  cell  layout  problems. 

3.1  Overview  of  the  Integrated  Flexible-Cell  Layout  System 

During  the  initial  synthesis  stages  of  IC  design,  designers  attempt  to 
partition  chip  area  and  choose  attributes,  e.g.,  areas,  aspect  ratios,  and 
I/O  pin  locations,  for  each  partitioned  chip  subarea.  However,  it  is  difficult  for 
designers  to  comprehend  the  complicated  relationship  between  initial  design 
decisions  and  the  final  global  assignment  of  modules.  Moreover,  the  conse- 
quences of  the  designer’s  decisions  may  not  become  apparent  until  after  a 
significant  amount  of  the  IC  design  process  has  been  performed.  Conse- 

quently, it  is  often  the  case  that  a great  deal  of  iteration  is  required  between  the 
chip  floor  planning  and  the  cell  layout  design  tasks  in  order  to  produce  a 

satisfactory  design.  In  order  to  achieve  such  kind  of  real-time  interaction 

between  chip  floor  planning  and  cell  layout  design,  an  automatic  cell  generation 
system  is  introduced  in  the  general  cell  layout  system.  Figure  3.1  shows  the 
functional  flow  of  the  integrated  flexible-cell  layout  system.  An  automatic 

chip  floor  planner  partitions  the  chip  area  into  chip  subareas  according  to 
the  input  cell  attributes  such  as  cell  area,  cell  aspect  ratio,  and  inter-cell  connec- 
tivity. The  cell  generation  system  then  produces  custom-fit  cells  from  a 
transistor  connectivity  description  and  a description  of  the  environment  in  which 
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Figure  3.1 


Functional  Flow  of  the  Integrated  Flexible-cell  Layout  System 
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the  cell  will  reside. 

The  cell  generation  system  requires  two  inputs  as  shown  in  Figure  3.2.  First, 
the  circuit  must  be  defined  in  terms  of  transistor  connectivity.  Second,  the  cell 
generation  system  requires  information  about  the  environment  in  which  the  cell 
will  reside.  The  boundary  constraints,  which  include  block  dimensions,  aspect 
ratio,  and  external  signal  placement,  are  given  by  the  chip  floor  planner  with  an 
emphasis  to  optimize  the  chip  performance.  The  cell  generation  system  then 
synthesizes  a CMOS  symbolic  cell  layout.  A cell  compactor  is  used  to  convert 
the  symbolic  description  to  a mask  level  description  as  shown  in  Figure  3.3. 
Because  the  symbolic  layout  process  is  isolated  from  the  design  rules,  it  is 
assured  of  technology-independence.  However,  since  symbolic  layout  and  physi- 
cal layout  are  operated  in  different  layout  domains,  a cell  structure  model  is 
used  to  guide  the  transition  between  different  layout  domains  (see  Figure  3.4). 
Figure  3.6  displays  the  transistor  connectivity  description  of  a compare 
circuit  which  is  shown  in  Figure  3.5.  The  textual  description  is  based  on  the 
language  developed  by  GE  [45]  and  EDIF  group  [46].  Next,  the  environment  in 
which  the  compare  cell  will  reside  is  defined  as  in  Figure  3.7  and  is  depicted  in 
Figure  3.8.  Given  the  environment  and  the  transistor  connectivity,  the  cell 
generation  system  will  create  a symbolic  layout  for  the  compare  cell  (see  Fig- 
ure 3.9).  The  symbolic  layout  will  be  inserted  into  the  floor  plan  which  expects 
to  minimize  the  redundant  chip  area. 

3.2  The  Structure  Model 

As  described  in  Chapter  11,  the  success  of  an  automatic  cell  layout  algo- 
rithm depends  on  whether  it  can  exploit  the  design  search  space  with  underlying 


modularity  and  geometric  simplicity.  In  other  words,  a structured  design 
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Figure  3.2  The  Symbolic  Layout  Process. 


CIRCUIT 

DESCRIPTION 


Figure  3.3  The  Cell  Compaction  Process.  Figure  3.4  Transition  among 

Different  Circuit  Domains. 
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a 


Figure  3.5  A Compare  Circuit,  a)  Gate-level  Circuit;  b) 
Transistor  Circuit. 
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(cell  compare 
( structure 
( transistor 
( s ignal 

(a  ((gate  pi)  (gate  p5)  (gate  nl ) (gate  n5))) 
(b  ((gate  p7)  (gate  n 7 ) ) ) 

(c_in  ((gate  p4 ) (gate  n4))) 

(b_bar  ((source  p7)  (drain  n7)  (gate  p2) 

(gate  p3 ) (gate  n2)  (gate  n3))) 
(c_out_bar  ((source  p4)  (source  p5)  (gate  p6 ) 
(drain  n4)  (drain  n5)  (gate  n6 ) ) ) 
(II  ((source  pi)  (source  p2 ) (drain  p 4 ) ) ) 

(12  ((drain  nl ) (drain  n2)  (source  n4))) 

(13  ((source  p3)  (drain  p5))) 

(14  ((drain  n3)  (source  n5))) 

(c_out  ((source  p6 ) (drain  n6))) 

(vdd  ((drain  pi)  (drain  p2)  (drain  p3 ) 

(drain  p6  ) (drain  p7 ) ) ) 

(vss  ((source  nl)  (source  n2 ) (source  n3) 
(source  n6 ) (source  n7))) 

( instance 

( P_type  (pi  p2  p3  p4  p5  p6  p7 ) ) 

(n_type  (nl  n2  n3  n4  n5  n6  n7 ) ) ) ) ) ) 

Figure  3.6  Transistor  Circuit  Description  of  Figure  3.5(b). 


CRITICAL  PATH  OF  THE  CHIP  FLOOR  PLAN: 
HORIZONTAL  = 260 
VERTICAL  = 240 

LONGEST  PATH  OF  THE  CHIP  FLOOR  PLAN 

WHICH  PASSES  THROUGH  THE  CELL: 
HORIZONTAL  = 260 
VERTICAL  = 210 

ORIGINAL  CELL  DIMENSION: 

WIDTH  =80 
HEIGHT  = 30 

SUGGESTED  CELL  DIMENSION: 

WIDTH  = 40-60 
HEIGHT  = 50-70 

SUGGESTED  I/O  LOCATIONS: 

TOP VDD 

BOTTOM VSS 

LEFT b 

RIGHT  C out 


Figure  3.7  Cell  Boundary  Information  Issued  by  Chip  Floor  Planner. 
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Figure  3.8  Cell  Configuration  Suggested  by  the  Cell  Boundary  Information. 

a)  Original  Cell  Configuration;  b)  New  Cell  Configuration. 


Figure  3.9  Symbolic  Layout  of  the  Compare  Cell. 
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methodology  with  predictable  layout  environment  is  essential  for  the  cell  layout 
design  automation.  However,  the  layout  structure  should  be  flexible  enough  to 
ensure  the  design  flexibility  in  cell  configurations.  Among  the  well  known  cell  lay- 
out methods,  we  chose  Gate  Matrix  Layout  structures  over  other  layout  styles 
for  the  following  reasons: 

(1)  The  cell  structure,  with  power  net  and  ground  net  running  horizon- 

tally and  polysilicon  line  running  vertically,  makes  both  the  inter-cell 
abutment  and  the  cell  stacking  easy  to  be  implemented.  Conse- 
quently, the  chip  floor  planning  is  much  easier  with  those  cells. 

(2)  The  underlying  grid  structure  provides  a discrete,  predictable  design 

space  for  both  device  placement  and  routing.  The  development  of  lay- 
out algorithms  is  much  easier  under  the  layout  environment. 

(3)  The  aspect  ratio  of  the  cells  can  be  adjusted  by  relaxing  a con- 
straint assumed  by  Gate  Matrix  Layout.  Each  gate  signal  can  be 
assigned  to  more  than  one  device  column.  Therefore,  the  aspect  ratio 
of  a cell  can  be  adjusted  by  changing  the  number  of  device  columns. 

3.2.1  The  Grid  Structure  of  Gate  Matrix  Layout 

The  Gate  Matrix  Layout  method  [9]  provides  a discrete  layout  structure  for 
the  determination  of  the  topology  of  the  layout  primitives.  The  layout  primitives 
include  the  basic  elements  like  transistors  (devices),  wires,  and  contacts.  Each  ele- 
ment is  represented  as  a special  symbol.  The  symbols  are  then  placed  in  an 
equally  spaced  grid  matrix  with  an  emphasis  to  optimize  certain  objective  func- 
tions. The  grid  structure  consists  of  two  sets  of  grids.  Vertical  grids  are  com- 
posed of  an  array  of  polysilicon  grids  and  diffusion  grids  as  shown  in  Figure  2- 
6(a).  In  the  original  Gate  Matrix  structure  proposed  in  Lopez  and  Law  [9],  each 
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gate  signal  can  be  assigned  to  only  one  polysilicon  grid  and  only  one 
diffusion  grid  is  allowed  between  each  two  adjacent  polysilicon  grids.  Those 
layout  rules  severely  limit  the  flexibility  of  cell  configurations.  In  this  disserta- 
tion, we  modify  the  layout  structure  such  that  each  gate  signal  can  be  imple- 
mented as  a set  of  interconnected  polysilicon  grids.  More  than  one  diffusion 
grid  can  be  inserted  into  each  pair  of  adjacent  polysilicon  grids.  The  modified 
vertical  grid  structure  is  shown  in  Figure  3.10(a).  With  such  modification, 
more  cell  configurations  can  be  enumerated.  Moreover,  more  routing  algo- 
rithms can  be  chosen  with  the  less  rigid  layout  structure. 

The  horizontal  structure  of  the  matrix  is  composed  of  power  net,  ground 
net,  device  rows,  and  horizontal  routing  channels  as  shown  in  Figure  2.6(b) 
which  is  redrawn  in  Figure  3.10(b).  The  power  net  and  ground  net  are  running  at 
the  top  and  bottom  of  the  cell  respectively.  Devices  are  placed  on  horizontal 
grids  which  are  called  device  rows.  Horizontal  wire  segments  which  connect 
the  devices  can  be  placed  on  either  device  rows  or  horizontal  routing  channels. 
In  order  to  simplify  the  design,  horizontal  wires  are  usually  realized  by  metal 
while  the  vertical  wires  are  realized  by  diffusion  line. 

3.2.2  The  Cell  Structure. 

With  the  grid  structure  described  in  section  3.2.1,  the  physical  cell  dimen- 
sion can  be  represented  as  a cell  structure  model  as  shown  in  Figure 
3.11.  In  vertical  dimension,  the  cell  consists  of  device  rows  and  horizontal 
routing  tracks.  Let  the  height  of  each  device  row  be  hr  and  the  height  of  each  hor- 
izontal routing  track  be  ht , then  the  cell  height  is  given  by 
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(a) 


(b) 


Figure  3.10  Revised  Gate  Matrix  Layout  Structure,  a)  Vertical 
Grid  Structure;  b)  Horizontal  Grid  Structure. 
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(b) 


Figure  3.11  The  Cell  Structure  Model,  a)  Vertical  Cell  Structure; 
b)  Horizontal  Cell  Structure. 
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H=nrhT  + nt  ht 

(3-1) 

where  nr  is  the  number  of  device  rows  and  nt  is  the  number  of  horizontal  routing 
tracks.  In  horizontal  dimension,  the  cell  consists  of  device  columns  and  inserted 
routing  tracks.  Let  the  width  of  a device  column  be  wc  and  the  width  of  an 
inserted  vertical  routing  track  be  wv  , then  the  cell  width  is  given  by 

W=nc  wc+nvwv 

(3-2) 


where  nc  is  the  number  of  device  columns  and  nv  is  the  number  of  vertical  rout- 
ing tracks. 


3.3  Problem  Formulation 

One  of  the  most  difficult  features  of  the  cell  generation  problem  is  to 
generate  the  cells  with  desired  configurations.  Since  the  cell  configurations  are 
usually  bounded  by  their  layout  structures,  it  is  essential  to  determine  an 
appropriate  grid  structure  before  the  symbolic  layout  process  may  begin.  There- 
fore, the  cell  generation  problem  is  divided  into  two  major  subproblems:  cell 
boundary  analysis  and  symbolic  layout.  The  process  of  cell  boundary  analysis 
(see  Figure  3.12)  analyzes  the  circuit  information  and  the  boundary  information 
to  determine  a grid  structure  which  will  lead  to  a desired  cell  configuration.  The 
symbolic  layout  process  (see  Figure  3.13)  then  uses  the  grid  structure  to  realize  the 
circuit  connectivity  with  an  optimized  object  function. 

To  define  these  subproblems  more  precisely,  we  introduce  some  notations  and 
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Figure  3.12  The  Cell  Boundary  Analysis  Process. 


Figure  3.13  The  Symbolic  Layout  Process. 
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definitions.  Let  T = { £,•  | f(-  is  a transistor}  be  a set  of  transistors.  Let  S = { 5,- 
| Sj  is  a net}  be  a set  of  nets  which  describes  the  circuit  connectivity.  Each  net 

-Sy  is  described  as  a set  of  interconnected  device  terminals,  St={pll,pi2, pik }. 

Let  C be  a set  of  columns  and  R be  a set  of  rows.  Let  B be  the  set  of  boun- 
dary constraints  given  by  the  chip  floor  planner.  Now  the  cell  boundary 
analysis  problem  is  described  as 

[Grid  Assignment  Problem]:  Given  a set  of  transistors  T,  a set  of  nets  S which 

describes  the  connectivity  of  T,  and  a set  of  boundary  constraints  B,  find  an 
assignment  function  } x : [T ,S ,B)  ====>  (C,R)  such  that  (1)  all  the  transis- 
tors and  connectivity,  can  be  realized  within  the  grid  structure  (C.R)  (2)  The 
grid  structure  (C.R)  will  lead  to  a desired  cell  configuration. 

Now  we  formulate  the  symbolic  layout  problem.  As  shown  in  Figure  3.14,  a 
circuit  is  composed  of  a set  of  interconnected  transistors.  Each  transistor  con- 
sists of  three  terminals:  drain,  gate,  and  source.  Those  transistor  terminals 

and  their  connectivity  are  to  be  embedded  in  a grid  structure  which  is  determined 
in  the  grid  assignment  process.  Figure  3.15  shows  two  grid  structures  with 
different  cell  configurations.  Each  grid  structure  contains  P grid  locations  which 
can  be  used  for  the  placement  of  transistor  terminals.  Obviously,  the  permuta- 
tion number  PN  for  those  transistor  terminals  within  the  grid  locations  is  very 
huge.  If  we  assume  that  the  number  of  transistors  of  a circuit  is  N,  the  permuta- 
tion number  for  all  3N  transistor  terminals  among  P grid  locations  (P> 3Ar)  is 


PN=C{P,3N)(3N)\ 

The  computation  complexity  is  very  high  even  with  small  number  of  P and 
N.  Fortunately,  with  the  help  of  the  layout  constraints  from  the  Gate  Matrix 
structure,  the  computation  complexity  can  be  reduced  tremendously.  Figure 
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Figure  3.14  Transistor  Circuit  of  an  Exclusive  OR  Logic  Gate. 


(a) 


Figure  3.15  Two  Possible  Grid  Structures  for  the  Layout  of  the  Circuit 
in  Figure  3.14.  a)  Wide-Shape  Structure  with  17  Vertical 
Grids  and  4 Horizontal  Grids;  b)  Long-Shape  Structure  with 
9 Vertical  Grids  and  8 Horizontal  Grids. 
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Figure  3.16  Four  kinds  of  Constraints  for  Gate  Matrix  Structure. 
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3.16  shows  four  kinds  of  constraints  of  the  layout  structure.  Category  I is  called 
"intra-device  bond"  where  the  two  terminals  of  each  device  have  to  locate  at 
adjacent  horizontal  locations.  However,  these  two  terminals  are  allowed  to 
switch  positions.  Category  II  is  called  "intra-gate  bond"  where  all  the  devices 
which  are  driven  by  the  same  polysilicon  gate  have  to  stay  at  the  same  column. 
The  devices,  however,  are  allowed  to  permute  among  the  grid  locations  in  that 
column.  Category  III  is  called  "inter-device  bond"  where  two  adjacent  devices 
are  allowed  to  share  a common  terminal  location  if  those  two  terminals  are 
common  electrically.  However,  those  two  terminals  should  have  different  grid 
locations  if  they  are  not  common  electrically.  Category  IV  is  called  "power  net 
bond"  where  p devices  and  n devices  are  usually  placed  at  different  regions  which 
are  closer  to  the  power  net  and  ground  net  respectively.  By  using  all  those 
geometric  constraints,  the  layout  problem  can  be  decomposed  into  several 
assignment  functions.  Let  G’  = { </  ,•  | g1  ,•  is  a gate  column}  be  a set  of  gate 
columns.  Each  gate  column  cf  ,•  is  composed  of  a set  of  devices, 
{f  i = Let  O = (east, west}  be  a set  of  orientations.  Now  the  dev- 

ice placement  subproblem  can  be  described  as 

[Placement  Problem]:  Given  a set  of  gate  signals  G,  a set  of  transistors  T,  a set 
of  device  columns  C,  a set  of  device  rows  R,  and  a set  of  orientations  O.  find 
the  assignment  functions  f 2 : G — > C , f3  : T — > O , /4  : T — > R such 
that 

(1)  The  assignment  functions  results  in  a device  topology  which  is  com- 
pletely routable. 

(2)  The  total  wiring  length  for  the  device  topology  is  minimized. 

Now  we  define  the  routing  problem.  Let  S’  = { S'  ■ | S'  ,•  is  a net}  be  a set 

of  nets  which  are  described  by  grid  points.  Each  net  S'  i = { p'  ivp'  p'  jk  } 

where  p'  ik  = ( rik,cik)  represents  a grid  point  with  coordinates  ( rik  ,ctk).  Let  S" 
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= {S"  1?  S"  2,....,  S"  k } be  a set  of  nets  which  are  described  by  grid  wire  seg- 
ments. Each  net  S"  ,■  = { W"  tl  , W"  ,-2  ik_l  } where  W"  ik  — 

[{'rikvciki)i(rik2icik2)]  1S  a w're  segment  described  by  two  grid  points.  The  routing 
problem  is  described  as 

[Routing  Problem]  Given  a 3-tuple  assignment  function  (/2,/ 3,/4)and  a 
functional  description  of  a net  S,  find  the  assignment  functions  /5  : S — > S’ 
and  / 5 : S’  — > S"  such  that  the  total  wiring  length  is  minimized. 

3.4  Automatic  Cell  Layout  Generation  Process 

The  implementation  of  the  symbolic  topological  layout  system  is  described 
in  this  section.  Figure  3.17  shows  the  function  flow  of  the  system  which  is  com- 
posed of  four  major  tasks:  a circuit  analyzer,  a placer,  a router,  and  a layout 
improvement  mechanism.  The  circuit  analyzer  is  used  to  analyze  the  lexical 
structure  of  the  circuit  description  and  the  cell  boundary  constraints  of  the 
environment.  It  then  determines  the  grid  structure  and  generates  a set  of  data 
structures  for  topological  layout  process.  The  placer  is  divided  into  several 
subtasks.  Each  subtask  is  designed  to  individually  solve  a subproblem  described 
in  section  3.3.  The  subtasks  are  gate  column  order  assignment,  device 
orientation  assignment,  device  row  order  assignment.  Gate  column  order 
assignment  is  the  task  which  determines  the  horizontal  grid  location  of  each 
gate  column.  Device  orientation  assignment  is  to  determine  the  relative  grid 
locations  of  the  tAvo  terminals  of  each  device.  Device  row  order  assignment  is 
to  determine  the  vertical  grid  location  of  each  device.  The  router  of  the  system 
contains  three  subtasks:  terminal  labeling,  wire-list  generation,  and  routing. 

Terminal  labeling  is  to  label  the  net  number  of  each  device  terminal  on  its  grid 
location.  The  generated  data  structure  is  then  used  by  the  other  two  subtasks 
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Figure  3.17  Functional  Flow  of  the  Automatic  Cell  Layout  Generation  System. 
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for  fast  searching.  Wire  segment  generation  is  to  determine  the  shortest 
path  which  realizes  the  circuit  connectivity.  Routing  is  to  determine  the  grid 
locations  for  the  shortest  path.  A layout  improvement  task  is  also  introduced 
in  the  system  to  improve  the  layout  result  if  it  is  found  unsatisfactory 
in  cell  area  or  cell  configuration.  In  this  symbolic  layout  system,  a final  com- 
paction is  needed  to  convert  symbolic  layout  into  physical  mask  data.  The 
algorithms  can  be  seen  in  various  studies  [10,11,13]  which  will  not  be  dis- 
cussed in  this  dissertation. 


CHAPTER  IV 

ANALYSIS  OF  CELL  BOUNDARY  CONSTRAINTS 


As  described  in  section  3.3,  the  cell  generation  problem  is  divided  into  two 
major  subproblems:  cell  boundary  analysis  and  symbolic  layout.  The  process  of 
cell  boundary  analysis  is  to  determine  a grid  structure  based  upon  the  analysis  of 
cell  boundary  conditions  and  circuit  connectivity.  There  are  two  reasons  for 
determining  a grid  structure  before  the  symbolic  layout  process.  First,  a grid 
structure  provides  a layout  environment  for  discrete,  deterministic  analysis.  Com- 
putation is  based  on  the  discrete  structure.  Second,  cells  laid  out  on  a grid 
structure  with  different  row/column  combinations  will  have  different  cell 
configurations.  By  controlling  the  row  and  column  numbers  of  the  grid  structure, 
a cell  with  desired  configuration  will  be  generated.  In  this  chapter,  the  process 
of  determining  the  discrete  grid  structure  is  presented.  The  process  uses  the 
cell  structure  model  described  in  Chapter  III  to  determine  the  layout  grid  struc- 
ture from  the  analysis  of  physical  boundary  constraints. 

Three  sections  are  included  in  the  chapter.  In  the  first  section,  the  relation- 
ship between  the  cell  configuration  and  chip  floor  plan  is  analyzed.  The  problem 
to  determine  the  cell  configuration  is  formulated  as  a quadratic  programming 
problem.  Objective  function  for  this  problem  is  derived  based  upon  the  analysis  of 
the  chip  floor  plan.  The  second  section  describes  the  equality  and  inequality  con- 
straints for  this  problem  which  are  derived  from  the  analysis  of  the  grid 
structure  and  circuit  structure.  In  the  third  section,  procedures  of  solving 
the  quadratic  programming  problem  are  presented.  Examples  are  given  for  the 
illustration  of  the  procedures. 
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4.1  Analysis  o£  Chip  Floor  Plan 


Chip  floor  planning  is  the  process  of  dividing  the  available  chip  area 
into  a number  of  rectangular  subareas  that  best  satisfy  the  area,  connection, 
and  performance  constraints  imposed  by  the  design  specification.  An  example  of 
a chip  floor  plan  is  shown  in  Figure  4.1.  Each  chip  subarea  in  the  figure  is 
designated  for  the  placement  of  a cell  of  the  circuit.  Since  a chip  floor  plan  is 
composed  of  cells  with  irregular  shapes  and  sizes,  shape  mismatch  between  neigh- 
boring cells  usually  produces  sizable  dead  space  in  chip  area.  The  shape  mismatch 
between  a cell  and  its  approtioned  chip  subarea  also  causes  considerable  dead 
space  (see  Figure  4.2).  Consequently,  iterative  improvements  of  a chip  floor  plan 
are  needed.  The  purpose  of  this  section  is  to  develop  a sysmatic  method  to 
analyze  a chip  floor  plan  and  assign  new  chip  subareas  to  the  cells  such  that  the 
dead  space  in  the  chip  area  is  reduced.  To  facilitate  the  analysis  process,  we 
define  two  special  directed  graphs  and  use  them  to  represent  the  topological  infor- 
mation of  the  cells  and  chip  subareas  of  a chip  floor  plan.  Let  Gx  = { VX,EX  }, 
be  a horizontal  constraint  graph  [10]  and  Gy  — { Vy  ,Ey  } be  a vertical  constraint 
graph.  Each  constraint  graph  represents  the  relative  placement  of  the  modules 
(cells  or  chip  subareas)  along  one  dimension.  Modules  are  shown  as  vertices  in 
the  graph.  Each  edge  ex  ( ey  ) in  the  horizontal  (vertical)  constraint  graph  is 
represented  as  a pair  of  ordered  vertices  ( vi , vt  ).  The  vertex,  which  edge  ex  is 
incident  out  of,  is  called  the  initial  vertex  of  ex.  The  vertex,  which  ex  is  incident 
into  is  called  the  terminal  vertex.  For  a horizontal  constraint  graph,  an  edge  ( 
Vj , vt  ) indicates  that  two  modules  are  horizontally  abutted  and  that  vt  lies  to  the 
right  of  v,-.  An  edge  ( vi , vt  ) in  a vertical  constraint  graph  indicates  that  two 
modules  are  abutted  vertically  and  that  vt  lies  to  the  bottom  of  v{ . Each  vertex 
vx  ( vy  ) of  the  horizontal  (vertical)  graph  is  assigned  a weight  which  is  equal  to 
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Figure  4.1  A Sample  Chip  Floor  Plan. 
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Figure  4.2  Physical  Dimensions  of  the  Cells  in  Figure  4.1. 
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the  width  (height)  of  its  represented  module.  Let  wx  ( vx  ) be  the  weight  of  a ver- 
tex vx  in  horizontal  constraint  graph  and  wy  ( vy  ) be  the  weight  of  a vertex  vy  in 
vertical  constraint  graph.  Let  \VA  and  HA  be  the  width  and  height  of  module  A 
which  is  represented  as  vertex  vt  and  v]  in  the  horizontal  and  vertical  constraint 
graphs  respectively;  then 


wx{vi)=\VA 


(4-1) 


Wy(Vj)=HA 

(4-2) 

The  weight  of  an  edge  is  equal  to  the  weight  of  its  initial  vertex.  Let  xw(A) 
denote  the  position  variable  of  the  western  boundary  of  the  chip  subarea  assigned 
for  module  A.  Functions  xe(A)  , yn{A)  , and  ys{A)  are  similarly  defined  as  the 
eastern  boundary,  northern  boundary,  and  southern  boundary  of  module  A. 
For  a horizontal  graph,  a directed  edge  from  vertex  A to  B indicates  the  follow- 
ing conditions: 


*e(A)  = XW{B) 


(4-3) 


VsU)  < yniB) 


(4-4) 


VAB)  < Vn  iA ) 


(4-5) 
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It  means  that  the  chip  subareas  for  functional  modules  A and  B are  horizon- 
tally abutted  and  that  module  A is  located  at  west  side  of  module  B.  Similar 
conditions  can  be  derived  for  a directed  edge  in  the  vertical  constraint  graph: 


ysiA)  = yn{B) 


(4-6) 


xw{A)  < xe{B) 


(4-7) 


xw(B)  < xe  (A  ) 

(4-8) 

It  is  obvious  that  the  width  (height)  of  the  chip  area  CW  (CH)  is  equal  to 
the  length  of  the  longest  directed  path  in  the  horizontal  (vertical)  constraint 
graph.  The  longest  directed  path  of  the  horizontal  constraint  graph  is  called  a 
horizontal  critical  path  (HCP).  The  same  definition  is  applied  for  vertical 
critical  path  (VCP).  The  vertices  and  edges  in  a HCP  (VCP)  are  called  hor- 
izontal (vertical)  critical  events  and  the  horizontal  (vertical)  critical  activi- 
ties. Figure  4.3  depicts  the  two  constraint  graphs  for  the  chip  floor  plan  in  Figure 
4.1.  Critical  paths  in  both  of  the  graphs  are  shown  as  double  lines.  VCP  con- 
tains critical  events  of  A and  D.  HCP  contains  critical  events  of  D and  S.  The 
length  of  VCP  is  16  which  is  equal  to  the  height  of  the  chip  area  in  Figure  4.1. 
The  length  of  HCP  is  29  which  is  equal  to  the  chip  width  in  Figure  4.1.  Let 
L(P)  denote  the  length  of  any  path  P.  Then  the  chip  dimension  CH  and  CW 
can  be  represented  as  the  summation  of  all  the  cell  dimensions  along  the  respec- 
tive critical  path 
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Figure  4.3  Constraint  Graphs  for  the  Chip  Floor  Plan  of  Figure  4.1 
and  Figure  4.2.  a)  Vertical  Constraint  Graph;  b) 
Horizontal  Constraint  Graph.  The  Weight  of  Each  Edge 
Represents  the  Cell  Dimension  of  the  Initial  Vertex 
of  that  Edge. 
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CH  = L {HCP ) = £ wxM 

v,  tHCP 

(4-9) 


CW  = L(VCP)  = £ w,(Vj) 

V , € VCP 

(4-10) 


4.1.1  Optimization  of  Chip  Floor  Plan 

The  objective  function  of  the  chip  floor  planning  process  is  to  optimize 
the  total  chip  area.  In  other  words,  it  is  to  minimize  HCP  and  VCP  along  each 
dimension.  There  is  no  known  efficient  technique  for  producing  an  optimal 
solution  to  this  problem.  Accordingly,  we  developed  a heuristic  algorithm 
to  iteratively  improve  the  chip  floor  plan.  The  algorithm  is  to  find  a module  in 
the  HCP  or  VCP  (but  not  both)  and  to  reconfigure  the  chosen  module  such  that 
the  length  of  one  of  the  critical  paths  is  reduced  while  the  length  of  the  other 
one  remains  the  same.  Let  HSL  (VSL)  denote  the  second  longest  path  in  the 
horizontal  (vertical)  constraint  graph.  Suppose  that  we  choose  a module  m which 
is  a critical  event  of  HCP  and  is  not  a critical  event  of  VCP.  Then  the  length  of 
HCP  is  equal  to  the  sum  of  the  width  of  module  m and  the  width  of  the  other 
modules  in  the  HCP: 


L ( HCP ) = Wm  + L ( HCP  -{m  }) 

(4-11) 

By  reducing  the  width  of  module  m,  the  length  of  the  horizontal  critical 
path  L(HCP)  will  be  reduced.  However,  there  is  an  upper  bound  for  the  reduc- 
tion of  L(HCP): 
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AW  = L (HCP ) - L(HSL) 


(4-12) 


It  means  that  whenever  L(HCP)  is  reduced  more  than  the  amount  of  AW,  it  is 
shorter  than  another  path  HSL.  In  that  case,  HSL  becomes  the  new  horizontal 
critical  path.  Therefore  any  further  size  reduction  of  the  cells  along  the  original 
HCP  will  not  help  in  reducing  the  chip  dimension. 

Now  we  consider  the  other  dimension  of  the  chip  area.  Since  the  decrease  of 
one  cell  dimension  causes  the  increase  of  the  other  dimension,  we  have  to 
make  sure  that  the  increase  of  the  cell  height,  which  is  caused  by  cell 
reconfiguration,  does  not  increase  the  length  of  VCP.  Because  we  have  assumed 
that  module  meVCP , the  upper  bound  which  is  allowed  for  the  increase  of  the 
cell  height  while  the  length  of  VCP  is  not  increased  is 

AH  = L(VCP)-L(VLPm) 

(4-13) 

where  VLPm  represents  the  longest  path  in  vertical  constraint  graph  which  passes 
through  module  m. 

By  adding  AH  to  the  original  cell  height  Hm  and  reducing  A IP  from  the  ori- 
ginal cell  width  Wm  , the  new  cell  boundary  becomes 


Wt  = Wm  - AW 


(4-14) 


Ht=Hm+ AH 


(4-15) 
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Similar  equations  will  be  derived  if  we  choose  a module  which  is  a critical 
event  of  YCP  and  is  not  a critical  event  of  HCP 


AH  = L(VCP)  - L ( VSL ) 


(-1-16) 


A W = L (HCP ) -L(HLPm) 


(4-17) 


= W„  + AH' 


(4-18) 


H„  = Hm  - AH 

(4-19) 

The  new  cell  boundary  represents  the  best  cell  configuration  sug- 
gested for  the  the  chosen  module  to  be  reconfigured.  If  the  module  can  be 
reconfigured  to  a new  configuration  which  fits  into  the  cell  boundary,  the  chip 
floor  plan  will  be  improved.  To  illustrate  the  chip  floor  plan  improvement 
process  more  clearly,  we  follow  the  same  example  in  Figure  4.1.  From  Figure  4.3, 
we  know  that  module  A is  a critical  event  of  VCP  but  not  a critical  event  of 
HCP.  By  reconfiguring  module  A into  a different  shape  as  shown  in  Figure  4.4, 
the  original  chip  floor  plan  will  be  reduced  to  a size  of  29x13  (see  Figure  4.5). 

4.1.2  Objective  Function  for  Cell  Reconfiguration 


After  the  new  cell  boundary  is  calculated,  one  of  the  major  objectives  for 
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Figure  4.4  Reconfiguration  of  the  Adder  Cell  with  a Different  Shape. 


Figure  4.5  The  Chip  Area  Is  Reduced  after  the  Cell  Reconfiguration. 
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the  cell  reconfiguration  is  to  determine  a new  cell  configuration  which  best  fits 
the  new  cell  boundary.  The  configuration  of  the  reconfigured  cell  may  still 
not  fit  perfectly  with  the  new  designated  subarea.  However,  the  main  purpose 
of  this  cell  reconfiguration  is  to  try  to  reduce  the  redundant  chip  area  as  much 
as  possible.  Figure  4.6  shows  several  situations  which  cause  redundant  chip 
area.  In  case  1,  both  dimensions  of  the  cells  are  longer  than  those  of  the  cell  boun- 
daries. In  case  2,  the  height  of  the  cell  exceeds  the  cell  boundary  while  the  cell 
width  stays  within  the  boundary.  In  case  3,  the  width  of  the  cell  exceeds  the  cell 
boundary  while  the  cell  height  stays  within  the  boundary.  In  case  4,  neither  of 
the  cell  dimension  exceeds  the  boundary.  Let  W and  H denote  the  width  and 
height  of  the  cell  to  be  generated.  Then  the  objective  function  of  the  cell 
reconfiguration  is  to  minimize  the  redundant  cell  area  which  exceeds  the  desig- 
nated cell  boundary.  If  several  cell  configurations  cause  the  same  amount  of 
redundant  cell  area,  the  one  with  the  minimum  cell  area  will  be  chosen. 
Consequently,  the  objective  function  can  be  represented  as 

minimize 


y = RA  = aW{H  - Hb)  + 0H{  IF  - \Vb ) 
-ctftH  -Ht)(W  - IK,) 


(-1-20) 


where  a = 1 if  H > Hb  ; a = 0 if  H < Hb 


(4-21) 


/?  = 1 if  W > Wh  ; $ = 0 if  H < Hb 


(4-22) 
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Figure  4.6  Four  kinds  of  Cell  Shape  Mismatch,  a)  Both  of  the  Cell 
Height  and  Cell  Width  exceed  the  Cell  Boundary;  b)  Only 
the  Cell  Height  Exceeds  the  Boundary;  c)  Only  the  Cell 
Width  Exceeds  the  Boundary;  d)  Neither  of  the  Cell 
Dimensions  Exceed  the  Boundary. 
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4.2... Geometric  Constraints  for  the  Cell  Configuration 

The  Quadratic  Programming  problem  formulated  above  has  several 
geometric  constraints  which  limit  the  domain  of  its  feasible  solutions.  The 
geometric  constraints  come  from  two  main  sources:  layout  guidelines  and  Gate 
Matrix  layout  structure.  In  this  section,  all  those  geometric  constraints  are 
analyzed  and  formulated  as  equality  or  inequality  constraint  equations  for 
the  quadratic  programming  problem. 

4.2.1  Constraints  by  Discrete  Grid  Structure 

The  boundary  constraints  from  the  chip  floor  plan  are  usually  described  in 
terms  of  physical  dimensions.  However,  since  the  layout  is  implemented  on  a 
discrete  grid  structure,  the  selections  of  the  cell  configurations  are  limited  by  the 
number  of  possible  combinations  of  integer  grid  column  and  row  numbers.  As 
discussed  in  Chapter  II,  the  physical  cell  dimensions  are  represented  as  equations 
of  grid  numbers: 


\V  = ncwc  + nv  wv 


(4-23) 


H = nr  hr  + nt  ht 

(4-24) 

Among  the  equations,  wc  ,wv,hr,ht  are  technology  dependent  which  can  be 
estimated  with  given  design  rule  information.  Figure  4.7  shows  the  estimation 
of  the  physical  dimension  of  the  Gate  Matrix  structure  based  upon  the  CMOS 
design  rules  described  in  Weste  and  Eshraghian  [36].  Substituting  the  physical 
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Figure  4.7  Estimation  of  the  Physical  Cell  Dimensions,  a)  Vertical 
Dimension;  b)  Horizontal  Dimension. 
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dimensions  into  equations  4-20  and  4-21  yields 


W = 8 nc  + 6 nv 


(4-25) 


H = 6(  nr  + nt ) 


(4-26) 


4.2.2  The  Number  of  Grid  Columns 

In  the  original  structure  of  Gate  Matrix  layout  method,  each  gate  signal  in 
a circuit  is  assigned  a grid  column  in  the  cell  design.  Mathematically,  the 
number  of  grid  columns  will  be  equal  to  the  number  of  gate  signals.  In  the 
relaxed  Gate  Matrix  layout  system,  each  gate  signal  may  be  assigned  to  more 
than  one  gate  signal.  Therefore,  the  number  of  grid  columns  is  now  a variable 
instead  of  a constant.  However,  the  number  of  grid  columns  is  still  bounded  in  a 
certain  range  related  to  the  circuit  connectivity.  Since  stacking  and  folding  of 
cell  structure  is  not  considered,  each  signal  will  be  assigned  to  at  least  one  gate 
column.  Consequently,  the  minimum  number  of  grid  columns  is  equal  to  Ng 
which  is  the  number  of  gate  signals  of  the  circuit.  Due  to  the  assumption  of 
parallel  power  and  ground  lines  located  at  the  top  and  the  bottom  respectively, 
each  column  requires  at  least  2 devices.  Therefore,  the  maximum  number  of  grid 
columns  is  equal  to  Nd/ 2 where  Nd  denotes  the  total  number  of  devices  in  the  cir- 
cuit. Let  nc  denote  the  number  of  device  columns  needed  for  a circuit  to  be 
implemented  on  a gate  matrix.  Then  the  range  of  the  number  of  device  columns 
can  be  represented  as 
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Ng  < nc  < 


AL 


(4-27) 


4.2.3  The  Number  of  Device  Rows 

For  the  Gate  Matrix  layout  structure,  the  minimum  number  of  the 
devices  located  in  each  device  column  is  2.  That  is  also  the  minimum  required 
number  of  the  device  rows.  The  maximum  number  of  device  rows  equals  the  max- 
imum number  of  devices  assigned  to  any  of  the  device  columns.  Let  Ndi  be  the 
number  of  devices  located  on  ith  device  column  and  MAX (Ndi ) be  the  max- 
imum number  of  Ndi  for  all  i.  Then  the  range  of  the  number  of  device  rows  nr  for 
a given  circuit  can  be  represented  as 


2 <nr  < MAX (Ndi ) 


(4-28) 


The  exact  number  of  device  rows  may  depend  on  how  many  device  columns 
are  implemented  and  how  the  transistors  are  distributed  among  those  device 
columns.  Normally,  the  number  of  device  rows  decreases  as  the  number  of 
device  columns  increases.  However,  if  the  devices  are  unevenly  distributed,  a 
small  increase  of  device  columns  may  not  change  the  number  of  device  rows.  Let 
WS  denote  the  unused  grid  locations  caused  by  uneven  distribution  of  devices. 
Then  the  number  of  device  rows  can  be  represented  as 


N,,  + WS 


(4-29) 
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where  WS  can  be  represented  as  the  summation  of  the  unused  grid  locations  in 
each  device  column 


vrs  = £{MAX(Ndi)  - Ni{} 

1=1 

(4-30) 

Figure  4.8  shows  an  example  of  the  row-column  relationship.  The 
number  of  device  rows  decreases  as  the  number  of  device  columns  increases. 
However,  the  relationship  is  obviously  not  linear. 

4.2.4  The  Number  of  Routing  Tracks 

It  is  very  difficult  to  estimate  the  number  of  routing  tracks  before  the  lay- 
out process  actually  starts  because  the  information  is  not  sufficient.  Therefore, 
the  number  of  routing  tracks  can  only  be  roughly  estimated.  We  start  from 
estimating  the  average  horizontal  length  of  a net.  In  general,  there  are  two  kinds 
of  nets  whose  horizontal  length  should  be  estimated  differently.  The  nets  among 
the  interconnected  device  terminals  are  usually  decomposed  into  a set  of  two-point 
nets.  Each  two-point  net  is  usually  realized  as  two  vertically  intersected  wires  as 
shown  in  Figure  4.9.  If  we  assume  that  the  probability  of  the  location  of  any  net 
terminal  is  evenly  distributed  among  the  horizontal  grid  locations,  then  the 
average  horizontal  length  of  any  two-point  net  is 


h 


nc-\ 

E l(nc  ~ i) 

1=1 

C(nc.  2) 


n„ 


+ 1 


3 


(4-31) 
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A B C D 


A1  A£  B C D 


A1  AE  A3  A4  B1  BE  Cl  CE  D1  BE  D3 


Figure  4„8  Different  Grid  Row  and  Column  Combinations. 


Figure  4.9  A Two-Point  Net 


Figure  4.10  A Multi-Point  Net 
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However,  the  nets  among  each  set  of  interconnected  polysilicon  grids  can  be 
directly  implemented  as  a horizontal  wire  segment  as  shown  in  Figure  4.10. 
Each  set  of  interconnected  polysilicon  grids  usually  occupies  the  whole  horizontal 
routing  track  due  to  the  connection  to  its  pin  location.  The  number  of  horizontal 
routing  tracks  is  the  sum  of  the  routing  tracks  reserved  for  both  kinds  of  nets. 

Let  S2~net  be  the  number  of  two-point  nets  in  the  circuit  and  Sm_net  be  the 
number  of  the  sets  of  interconnected  polysilicon  grids.  The  estimated  number  of 
horizontal  routing  tracks  is 


$ 2-net  L/j  nr  /d 


+ Sm- 


m—net 


(-1-32) 


where  nr/4  is  the  weight  factor  of  the  extra  horizontal  routing  detours  caused  by 
the  vertical  routing  obstacles.  When  the  number  of  device  rows  increases,  the  total 
length  of  horizontal  routing  detours  will  also  increase  because  there  are  more 
blocking  device  terminals.  The  number  of  nr/ 4 is  always  rounded  to  its  closest 
integer  number.  When  nr  is  smaller  than  4,  the  weight  factor  is  rounded  to  1. 
For  a long-shaped  layout,  the  number  of  Sm_net  is  small  but  the  number  of 
S 2-net  1S  lai'Se-  Since  the  vertical  routing  distance  is  longer  for  this  kind  of  lay- 
out structure,  the  average  wire  length  for  each  two-point  net  is  expected  to  be 
higher  because  of  routing  detours.  For  a wide-shape  layout,  the  situation  is 

reversed.  The  complete  discussion  of  those  routing  situations  will  be  covered  in 
Chapter  VI. 

In  Gate  Matrix  layout,  there  is  only  one  reserved  grid  between  each  two 
adjacent  polysilicon  columns.  Consequently,  any  two  adjacent  devices  will  be 
abutted.  In  case  two  devices,  which  do  not  have  any  common  terminal,  need 
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to  be  placed  in  adjacent  locations,  a grid  column  has  to  be  inserted  to 
separate  those  two  devices.  Therefore,  the  number  of  inserted  vertical  routing 
tracks  is  related  to  the  number  of  different  net  segments  which  have  to  be  placed 
at  the  same  device  row.  The  number  nv  is  increased  whenever  more  nets  have  to 
be  squeezed  into  fewer  number  of  device  rows 

2 —net  ”b  m~-net  ^ r 

nv== 

(4-33) 


4.2.5  Determination  of  the  Grid  Structure 

The  constraints  from  both  chip  floor  plan  and  circuit  itself  represent  a 
set  of  equality  and  inequality  constraints.  Those  constraints  form  a set  of 
boundaries  for  the  feasible  solutions  of  (nc,nt,nr,nv)  . Combined  with  the  objec- 
tive function  formulated  in  section  4.1.2,  the  complete  problem  for  determining 
the  grid  structure  can  be  described  as 

minimize 


y = RA  — aW(H  - Hb ) + 0H{  W - Wb ) 
- c0(H  - ff„ )( IV  - IV, ) 


(4-3-1) 


subject  to 


(4-35) 


a = 1 if  H > Hb  ; a = 0 if  H < Hb  ; 
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/?  = 1 if  VK  > Wb  ;■/?  = 0 if  VK  < Wb  ; 


(4-36) 


\V  — Snc  + 6 nv 

(4-37) 


H = 6 (nr  + nt ) 

(4-38) 


Ng  < nc  < Nd 

(4-39) 


2 < nr  < MAX  (Ndi ) 

(4-40) 


+ WS 


(4-41) 


WS-E  [M4X(^,.)-iV,.] 

i = i 

(4-42) 


*^2 —net^h^r/3  > ^ 

~~  ' m — net 

nc 

(4-43) 
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n -I-  1 


(4-44) 


nv  = 


’ 2-net 


"I”  ^ m—net 


(4-45) 


4.2.6  The  Enumeration  Method 

Because  most  of  the  equations  are  highly  interrelated,  there  is  no  known 
efficient  algorithms  to  solve  the  quadratic  programming  problem  simultaneously. 
In  this  dissertation,  we  use  an  enumerative  method  to  find  the  optimal  solution. 
The  objective  function  is  calculated  over  all  the  possible  grid  numbers.  Since  all 
the  three  grid  numbers  nr,  nt,  and  nv  are  dependent  on  nc,  the  computation  com- 
plexity is  in  the  order  of  0(nc).  Therefore,  the  computation  time  for  the 
enumerative  approach  is  within  acceptable  range.  Following  are  the  pro- 
cedures to  find  the  solution: 

(1)  Enumerate  all  the  possible  device  column  nuber  nc  by  equation  4-39. 

(2)  Find  the  corresponding  device  row  number  nr  for  each  nc  by  equations 

4-40,  4-41,  and  4-42. 

(3)  Find  the  corresponding  number  of  routing  track  nt  for  each  pair  of 

( nc,nr ) by  equations  4-43  and  4-44. 

(4)  Calculate  W and  H for  each  quadruple  (nc  ,nr  ,nt  ,nv ) by  equations  4-37 

and  4-38. 

(5)  Calculate  the  redundant  area  (RA)  for  each  quadruple  by  substituting 

the  result  in  step  4 into  equations  4-  34,  4-35,  and  4-36. 

(6)  Select  the  quadruple  (nc,nr,nt,nv)  which  has  the  minimum  RA.  If 
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several  quadruples  have  the  same  amount  of  RA,  choose  the  one  with 
the  smallest  cell  area. 

(7)  Construct  the  grid  structure  based  on  the  selected  grid  numbers. 

Following  is  an  example  which  illustrates  the  complete  enumeration  process. 
Figure  4.11  shows  the  transistor  circuit  of  a full  adder  cell.  Figure  4.12  is  the 
input  circuit  description  of  the  cell.  The  circuit  analyzer  of  the  system  summar- 
izes the  circuit  information  as  shown  in  Figure  4.13.  Figure  4.14  depicts  three 
different  cases  of  cell  boundary  required  by  the  chip  floor  planner.  Based  on  the 
enumerative  method,  all  the  quadruples  (nc,nr,nt,nv)  are  enumerated  as  shown  in 
a table  in  Figure  4.15.  All  the  related  information  W,H,  and  RA  for  each  quadru- 
ple is  also  included  in  the  table.  In  case  one,  a long  shape  cell  configuration, 
which  has  the  grid  number  of  (5, 8, 8,1),  is  selected.  In  case  two,  a square-shape  cell 
configuration  with  the  grid  number  of  (7, 6, 8, 2)  is  selected.  In  case  three,  a 
wide-shape  cell  configuration  with  the  grid  number  of  (14,2,5,4)  is  selected.  All 
the  generated  grid  structures  are  shown  in  Figure  4.16. 
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V00  Vqo 


Figure  4.11  Transistor  Circuit  of  a Full  Adder  Cell. 


Summary  of  Circuit  Information 

Number  of  transistors:  28 

Number  of  gate  signals:  5 

Devices  distribution  on  gate  signals:  (8,8, 6, 4,2) 

inter-net  connections:  10 

N u mb e r of  mu lti-point  gate  nets.  2 


Figure  4.13  Computer  Generated  Summary  of  the  Circuit  Information. 
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(14  ((source  n5)  (drain  n4)  (drain  n3 ) ) ) 

(15  ((source  nl ) (drain  n2))) 
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(p_type  (pi  p2  p3  p4  p5  p6  p7  p8  p9  plO 
pll  pl2  pi 3 pi 4 ) ) 

( n_type  (nl  n2  n3  n4  n5  n6  nl  n8  n9  nlO 
nil  nl2  nl3  nl4 )))))) 


Figure  4.12  Computer  Language  Description  of  the  Full  Adder 
Circuit. 
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Figure  4.14  Three  Different  Cell  Boundary  Conditions, 
a)  Case  1;  b)  Case  2;  c)  Case  3. 


64 


nc 

nr 

nt 

nv 

V 

H 

5 

3 

8 

1 

46 

96 

6 

8 

9 

1 

54 

102 

7 

6 

a 

2 

68 

34 

a 

4 

7 

3 

82 

66 

9 

4 

7 

3 

90 

66 

10 

4 

7 

3 

98 

66 

11 

4 

7 

3 

106 

66 

12 

4 

7 

3 

114 

66 

13 

4 

7 

3 

122 

66 

14 

2 

5 

4 

136 

42 

cell 

area 

case  1 
RA 

case  2 
RA 

case  3 
RA 

4416 

736* 

966 

2116 

5508 

1188 

1458 

2808 

5712 

932 

612* 

2312 

5412 

1452 

792 

1312 

5940 

1980 

1320 

1440 

6468 

2508 

1848 

1568 

6996 

3036 

2376 

1696 

7524 

3564 

2904 

1824 

8052 

4092 

3432 

1952 

5712 

3192 

2772 

252* 

Figure  4.15  Computer  Generated  Result  of  the  Enumeration  Process. 


(a) 


(b) 


(c) 

Figure  4.16  Three  Different  Grid  Structures  Generated  for  Three 

Different  Cell  Boundary  Conditions  Shown  in  Figure  4.14. 
a)  For  Case  1;  b)  For  Case  2;  c)  For  Case  3. 


CHAPTER  V 
DEVICE  PLACEMENT 


After  the  grid  structure  for  the  cell  layout  environment  is  deter- 
mined, the  task  of  the  cell  layout  process  is  to  determine  the  locations  of 
the  layout  primitives  on  the  grids.  In  order  to  reduce  the  computation  com- 
plexity, the  cell  layout  process  in  this  dissertation  is  divided  into  two  sequen- 
tial tasks:  placement  and  routing.  Device  placement  is  the  task  to  determine  the 
grid  locations  of  all  the  device  terminals.  The  routing  process,  which  will  be  dis- 
cussed in  the  next  chapter,  is  the  task  to  realize  the  circuit  connectivity,  with 
wire  segments,  among  the  device  terminals. 

Three  kinds  of  grid  locations  are  to  be  determined  in  the  device  placement 
process:  horizontal  grid  locations  of  the  devices,  vertical  grid  locations  of  the 
devices,  and  the  relative  grid  locations  of  the  two  device  terminals  (source  and 
drain)  of  each  device.  Accordingly,  the  device  placement  problem  is  divided  into 
three  assignment  tasks.  The  first  section  illustrates  the  method  to  determine 
the  horizontal  grid  locations  of  the  devices.  This  problem  is  formulated  as  a 
quadratic  assignment  problem  of  determining  the  permutation  order  of 
the  device  columns.  The  quadratic  assignment  problem  is  then  solved  by 
heuristic  clustering  and  linear  placement  techniques.  The  second  section  presents 
the  technique  to  determine  the  orientation  of  each  device.  The  grid  locations  of 
all  the  device  terminals  will  be  determined  after  both  the  device  location  and 
device  orientation  are  determined.  Emphasis  on  the  algorithms  is  to  optimize 
the  cost  functions  such  as  transistor  abutment,  horizontal  wiring  length,  and  vert- 
ical wiring  length.  Finally,  the  last  section  addresses  the  technique  to  determine 
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the  vertical  grid  location  of  each  device.  The  algorithms  emphasize  reducing  the 
total  vertical  wiring  length  and  the  total  device  area.  A vertical  constraint  graph 
based  approach  is  used  to  determine  the  vertical  grid  locations  of  the  devices 
which  cause  the  minimum  number  of  vertical  wiring  obstructions.  Consequently, 
the  total  vertical  wiring  length  will  be  minimized. 

5.1  Device  Column  Order  Assignment 

In  the  Gate  Matrix  Layout  structure  each  device  is  created  by  running  a 
diffusion  bar  over  a polysilicon  strip.  The  polysilicon  strip  serves  as  the  gate 
signal  of  the  device.  The  two  separated  segments  of  the  diffusion  bar  are  used 
as  the  source  and  drain  terminals  of  the  device.  Figure  5.1  shows  the  physi- 
cal implementation  of  a device.  Each  gate  signal  is  realized  as  either  one  or 
several  interconnected  vertical  grids.  Each  vertical  grid  serves  as  a common  site 
for  a set  of  devices  (transistors)  and  is  called  a gate  column  or  device  column. 
Let  G’  — { </  , | (/  ,■  is  a device  column}  be  a set  of  device  columns  and  each  dev- 
ice column  be  composed  of  a set  of  devices,  (f  ,•  = { tf  n,  if  ,-2» » ^ ik  }•  Figure  5.2 

shows  the  gate  column  and  device  structure.  We  can  see  from  the  figure  that  all 
the  devices  which  are  located  on  the  same  gate  column  should  have  the  same 
horizontal  grid  location  as  that  gate  column.  In  other  words,  the  horizontal  grid 
locations  of  the  devices  will  be  decided  once  the  grid  locations  of  their  gate 
columns  are  determined.  Consequently,  the  problem  of  determining  the  horizontal 
grid  locations  of  the  devices  can  be  divided  into  two  subproblems.  The  first  one 
is  to  convert  the  set  of  gate  signals  G into  a number  of  gate  columns  (device 
columns)  G’.  The  transistors  which  belong  to  the  same  original  gate  signal 
will  also  be  partitioned  and  assigned  to  several  different  device  columns 
(/  j ,</  k ,...., (f  i , if  the  original  gate  signal  is  realized  as  several  device  columns. 
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Figure  5.1  Three  Different  Representations  of  a Device,  a)  Circuit 
Representation;  b)  Grid  Representation;  c)  Physical 
Representation. 
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Figure  5.2  Device  Locations  on  a Grid  Structure,  a)  A Sample 
Circuit;  b)  Its  Corresponding  Grid  Representation. 
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The  second  problem  is  then  to  determine  an  optimal  permutation  order  for 
the  device  columns  which  results  in  the  minimum  total  wiring  length  among 
the  devices  and  signals. 

5.1.1  Gate  Splitting 

Gate  splitting  is  the  process  to  convert  the  set  of  gate  signals  into  a desired 
number  of  device  columns.  Let  G — { <7,-  | g is  a gate  signal  } be  a set  of  gate 
signals  of  the  circuit,  and  each  gate  signal  g{  be  composed  of  a set  of  devices  g{  — 
{ }.  Let  Ng  be  the  total- number  of  gate  signals.  Let  G'  = { </  ■ \ 

</  j is  a device  column  } be  a set  of  device  columns  and  each  device  column  </  , be 
composed  of  a set  of  devices  ■ = { t!  gl,f  g2,....,lf  gk  }■  Let  Ng  and  N g denote 
the  total  number  of  gate  signals  and  device  columns  respectively.  Then  the  Gate 
splitting  process  is  to  find  the  assignment  function 

f 7 : G * G 

such  that 

(1)  The  number  of  the  device  columns  iV  g is  equal  to  the  number  assigned 

by  the  circuit  boundary  analysis  process. 

(2)  The  number  of  total  device  rows  nr  is  minimized. 

Given  the  equations  described  in  Chapter  IV, 


Nd  + WS 


(4-29) 
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WS  = E [Ndi  — MAX  (Ndi )] 

i = i 

(4-30) 


where  nc  and  nr  denote  the  number  of  device  columns  and  device  rows  which  are 
assigned  during  the  cell  boundary  analysis  process.  Whenever  there  are  more 
unused  grid  locations  (WS),  which  are  caused  by  uneven  distribution  of  devices, 
there  will  be  more  device  rows  required.  Consequently,  the  Gate  splitting  pro- 
cess will  emphasize  reducing  the  number  of  unused  grid  locations.  From  equation 
4-30,  it  is  obvious  that  WS  can  be  minimized  by  minimizing  the  maximum 
number  of  Ndi,  i = 0 to  Ng.  The  procedures  of  gate  splitting  are  described  as  fol- 
lows: 

(1)  Assign  the  initial  device  column  number  N g — Ng. 

(2)  Select  a gate  signal  gk  where  Ndk  = MAX(  Ndi  ),  i = 1 to  Ng. 

(3)  Split  the  gate  signal  gk  into  two  device  columns,  </  k and  g"  k.  The 

devices  in  gate  signal  gk  are  partitioned  into  two  subgroups.  Since  the 
numbers  of  n devices  and  p devices  are  equal  within  each  device 
column,  the  number  of  devices  in  </  k and  g"  k will  both  be  even 
numbers.  Therefore,  if  Ndk  of  the  original  gate  signal  gk  has  a divisor 
of  4,  then 


A*  dk  — A/1  — 


N, 


dk 


dk 


(5-1) 


or  else 
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*dk 


Ndk  +2 

9 


(5-2) 


N 


Ndk  ~ 2 


dk 


(5-3) 


(4)  Increase  the  number  of  N by  1 and  check  if  it  is  equal  to  the  device 

column  number  nc . 

(5)  If  A f g = nc , stop.  Otherwise,  go  to  step  2. 

5.1.2  Formulation  of  the  Device  Column  Order  Assignment  Problem 

The  problem  of  device  column  order  assignment  can  be  considered  as  a 
classical  Quadratic  Assignment  problem  [47-49].  This  problem  is  a generaliza- 
tion of  the  assignment  problem  in  the  following  sense.  Given  n people 
(device  columns)  and  a matrix  [ c,;-  ],  where  ctJ  is  a measure  of  the  affinity  (wires) 
between  the  two  people  i and  j.  Also  are  given  n possible  offices  (slots)  for 
these  people  and  a matrix  [ dkl  ],  where  dkl  is  the  distance  between  office  k and 
1.  If  i is  assigned  to  office  p(i)  and  j is  assigned  to  office  p(j),  then  the  cost  of 
the  assignment  is  taken  to  be  c,y ^p(i')p(;)-  We  now  take  as  our  measure  of  total 
cost  the  sum  of  cij  ^v{i)p[j)  costs  over  all  i and  j,  and  seek  the  assignment  which 
yields  the  smallest  total  cost.  The  problem  can  be  formally  described  as  follows 
[Problem  - Quadratic  Assignment]  Given  a cost  matrix  M = [ c,-;-  ],  and  a dis- 
tance matrix  D — \ dkl  ],  find  a permutation  P of  i,j  such  that 
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^ Yjcijdp{i)p{j) 
ij 

(5-4) 

is  minimized. 

Now  we  formulate  the  device  column  order  assignment  problem.  Let  T = { 
£,■  | ti  is  a transistor  } be  a set  of  transistors  of  the  input  circuit.  Let  G’  = { </  ,•  | 
g ' { is  a device  column  } be  a set  of  device  columns  generated  by  the  gate  splitting 
process.  Given  a device  ti  , let  S(£t)  denote  a set  of  signals  associated  with  the 
drain  and  source  terminals  of  device  £,•  . For  any  device  column  </  let 

SW  i)=  u 5(*.0 

Utg1 , 

(5-5) 


be  the  set  of  signals  associated  with  the  devices  which  are  located  on  device 
column  (/  ,•  . The  connections  (cost  function)  between  two  device  columns,  (/  ,•  and 
( i j , are  then  defined  by  the  common  signals  of  the  two  device  columns 


sw.oryW;) 


(5-6) 


Let  C = {l ,2, — , nc  } be  a set  of  column  numbers.  If  the  device  columns  </  • 
and  (j  j are  to  be  assigned  column  numbers  k and  1,  then  the  total  horizontal  wir- 
ing length  of  this  single  assignment  is  taken  to  be  St](k—l ) . The  distance  matrix 
is  therefore  defined  as  D = [ dkl  ] where  dkl  = [k  - l|.  Now  the  device  column 
order  assignment  problem  can  be  formally  described  as 

[Problem  - Device  Column  Order  Assignment]  Given  a connectivity  func- 
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tion  S = [ 5,-;-  ] and  a distance  function  D = [ ],  find  a permutation  order 

among  the  device  columns  G = { </  j,</  2 such  that  the  total  horizontal  wir- 
ing length 


HW  = E50^p(i)p(;) 
ij 

(5-7) 


is  minimized,  where  p(i)  denotes  the  column  order  assignment  of  ith  device  column 
Figure  5.3  depicts  the  problem  of  device  column  order  assignment.  The 
device  columns  are  represented  as  interconnected  modules.  The  modules  are 
then  to  be  linearly  placed  in  an  equivalent  number  of  slots  (columns)  with 
an  emphasis  to  minimize  the  total  cost  function. 

5.1.3  Clustering  Algorithm 

The  quadratic  assignment  problem  has  been  proved  to  be  NP-complete  [47]. 
Several  researchers  [47-49]  have  proposed  heuristic  algorithms  in  an  effort  to 
generate  near-optimal  results.  In  1972,  Shuler  and  Ulrich  proposed  a clustering 
algorithm  [50]  to  solve  the  quadratic  assignment  problem.  In  this  dissertation, 
an  algorithm  similar  to  Shuler  and  Ulrich’s  approach  has  been  adopted  to 
solve  the  device  column  order  assignment  problem  for  the  following  reasons: 

(1)  For  device  placement  problems,  the  cost  function  contains  more 

than  just  the  connection  and  distance  function.  Transistor  abut- 
ment should  also  be  considered.  The  clustering  algorithm 
has  the  capability  to  handle  combined  cost  function. 

(2)  The  clustering  algorithm  is  free  to  choose  any  signal  as  the  seed  sig- 
nal so  as  to  match  the  cell  boundary  signal  requirement. 
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A B C D E F 


(a) 


(c) 


Figure  5.3  The  Device  Column  Order  Assignment  Problem,  a)  Grid 

Representation  of  a Circuit;  b)  All  the  Device  Columns 
Are  Represented  as  a Set  of  Interconnected  Modules.  Each 
Module  (Circle)  Represents  a Device  Column;  c)  Available 
Sites  for  the  Placement  of  the  Device  Columns. 
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(3)  The  algorithm  is  easy  to  implement  and  the  computation  time  is  short. 

The  Clustering  technique  was  first  implemented  by  Shuler  and  Ulrich  for 
the  linear  placement  of  a set  of  interconnected  modules.  The  idea  of  cluster- 
ing is  to  analyze  a set  of  interconnected  modules  and  form  groups  that  have 
strong  internal  connections  but  weak  external  connections  to  other  groups. 
Because  the  strongly  connected  modules  are  placed  together  while  the  far 
apart  modules  have  weak  interconnections,  the  total  cost  is  expected  to  be 
optimized.  Figure  5.4  shows  several  different  permutations  for  the  placement 
of  three  modules.  Assume  that  there  are  5 connections  between  modules  A 
and  B,  3 between  B and  C,  and  2 between  A and  C.  In  case  1,  the  strongly  con- 
nected modules  A and  B,  B and  C are  placed  together  while  A and  C are  placed 
apart.  If  we  assume  that  the  distance  between  each  pair  of  adjacent  modules  is 
1.  Then  the  total  cost  (wiring  length)  for  case  1 is  12.  In  case  2,  modules  A and 
B are  still  placed  together  but  modules  B and  C are  now  placed  apart.  Since  SBC 
is  greater  then  Sac  , the  longer  distance  between  B and  C increases  the  cost  to  13. 
In  case  3,  the  situation  is  even  worse  because  the  strongly  connected  modules  A 
and  B are  now  placed  apart.  The  total  cost  in  this  case  is  15. 

The  clustering  value.  Clustering  is  done  in  a pairwise  fashion.  A clustering 
value  is  calculated  for  each  pair  of  interconnected  elements  where  an  element  can 
be  either  a separate  module  or  a cluster  of  modules.  The  clustering  value  serves 
as  the  strength  measure  for  the  interconnection  relationship  among  all  the  ele- 
ments. The  value  should  be  a function  of  both  the  conjunctivity  (connec- 
tions between  i and  j)  and  disjunctivity  (connections  not  between  i and  j);  there- 
fore, even  though  the  pairs  of  two  elements  have  the  same  number  of  connec- 
tions, their  clustering  value  may  be  quite  different.  Figure  5.5  shows  the  factor 
of  disjunctivity  for  several  different  nets.  In  Figure  5.5(a)  and  5.5(b),  both  pairs 


75 


Case  1 


Figure  5.5  The  Importance  of 

Disjunctivity  in  Clustering, 
a)  The  Four  Connections  between 
A and  B Are  Significant;  b)  The 
Four  Connections  between  D and  E 
Are  Not  So  Significant;  c)  Module 
H Must  Be  Connected  with  G. 
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have  the  same  number  of  connections.  However,  if  the  elements  A and  B in  Figure 
5.5(a)  are  not  clustered,  the  total  cost  will  increase  from  7 to  10.  In  Figure 
5.5(b),  if  elements  D and  E are  not  clustered,  the  cost  will  increase  from  13  to 
14.  Therefore,  the  clustering  of  A and  B is  more  important  than  the  clus- 
tering of  D and  E.  In  Figure  5.5(c),  element  G does  not  tend  to  be  clustered  with 
element  H because  G has  five  connections  to  other  elements,  but  H has  the  only 
choice  to  be  clustered  with  element  G.  Consequently,  the  clustering  value  S VtJ  for 
elements  i and  j is  given  by  the  following  equation 


(5-8) 


SVij  = 


T,  ~ 


+ 


‘j 

Ti  ~ 


% 


where 

Sjj  = connections  between  elements  i and  j 
T,-  = total  connections  to  element  i 
Tj  — total  connections  to  element  j 

The  clustering  procedures.  After  the  clustering  value  is  calculated  for 
all  connected  pairs  of  elements,  the  pairs  with  the  greatest  CV’s  are  clustered. 
The  clustered  pairs  become  new  elements  which  are  then  eligible  for  the  next 
pair-wise  clustering.  The  process  is  continued  until  all  modules  (device  columns) 
are  drawn  into  a single  cluster  or  the  remaining  clusters  are  independent. 

The  clustering  tree.  After  the  clustering  procedure,  a binary  tree,  which  is 
called  the  clustering  tree,  is  generated.  The  clustering  tree  represents  the  topolog- 
ical relationships  among  the  modules  and  clusters.  An  example  is  given  in  Figure 
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Figure  5.6  Two  Kinds  of  Clustering  Trees,  a)  All  Modules  Are  Added  to  the 
Same  Cluster  One  by  One;  b)  Small  Clusters  Are  Created  First. 
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Figure  5.7  Four  Possible  Linear  Placement  Orders  for  the  Four-Module 
Cluster . 
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5.6.  The  binary  tree  is  a collection  of  elements  called  nodes  along  with  a rela- 
tion (parenthood)  that  places  a hierarchical  structure  on  the  nodes.  A single 
node  by  itself  is  a subtree.  The  parenthood  relationship  is  represented  by  the 
lines  downward  from  a node  n to  other  nodes  nl,n2,....,nk  . Node  n is  called  the 
parent  of  nodes  . Nodes  nvn2,....,nk  are  called  the  children  of 

node  n.  The  nodes  which  have  no  children  are  called  leaf  nodes.  The  node 
which  locates  at  the  top  level  of  the  binary  tree  (or  subtree)  is  called  the  root.  The 
level  of  a node  in  the  tree  is  the  length  of  a longest  path  from  the  root  to  the 
node.  As  shown  in  Figure  5.6,  each  module  (device  column)  is  represented  as  a leaf 
node  at  the  bottom  level.  Whenever  two  nodes  are  clustered,  the  new  cluster  is 
represented  as  a parent  node  of  those  two  nodes.  The  clustering  process  contin- 
ues until  all  the  modules  are  clustered  as  a single  cluster  which  is  represented 
as  the  root  node.  The  topological  information  included  in  the  binary  tree  is  that 
the  two  children  of  the  same  parent  should  be  placed  adjacently.  Figure  5.7 
shows  the  possible  linear  placement  of  a cluster. 

The  factor  of  transistor  abutment.  The  clustering  value  function  discussed 
so  far  only  concerns  about  the  connections  among  the  elements.  However, 
transistor  abutment  is  also  an  important  factor  which  should  be  considered  in 
device  placement.  Figure  5-8  demonstrates  the  importance  of  transistor  abut- 
ment in  device  placement.  In  Figure  5.8(a),  an  extra  diffusion  grid  is  needed 
because  transistor  A and  B cannot  be  abutted  by  this  arrangement.  In  Figure 
5.8(b),  the  extra  diffusion  grid  is  removed  because  A and  B can  share  the  same 
diffusion  area. 

Now  we  try  to  define  a clustering  value  which  also  considers  the  factor  of 
transistor  abutment.  Let  M,-  and  My  denote  two  clusters  which  are  represented  by 
two  subtrees  T l and  V 2 respectively.  Let  mtp  denote  the  pth  module  contained 
in  cluster  M,-  , and  m • denote  the  qth  module  contained  in  cluster  A/-  . Let 

jy  j cpjq 


I J2 
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(a)  (b) 

Figure  5.8  Importance  of  Transistor  Abutment,  a)  Two  Devices  Are  Not  Abutted; 

b)  Abutment  Reduces  the  Number  of  Grid  Columns  and  Connections. 


b b a cin  c0lJ-t 


Figure  5.9  Assignment  of  the  Device  Column  Orders  of  the  Circuit  in 
Figure  3.5.  a)  Graph  Representation  of  the  Device  Columns; 
b)  The  Generated  Clustering  Tree;  c)  Result  of  Linear 
Placement;  d)  The  Final  Device  Column  Orders. 


80 


denote  the  number  of  connections  between  module  mip  and  m;?  . Let  lip  denote 
the  level  of  module  mip  within  the  subtree  Tx  . Now  the  expected  number  of 
transistor  abutment  between  cluster  M,  and  M:  can  be  represented  as 


m n 

ABT{j  = E E 

p-i?* 


■’tpjq 


i 2l'r2lJ, 


(5-9) 


By  adding  the  factor  of  equation  5-9  into  equation  5-8  yields 


SVtJ 


Tt  - Stj 


Ti  ~ Sn 


m m 


+ E E 


> PJQ 


2'”  2'" 


(5-10) 


Linear  placement algorithm.  The  linear  placement  algorithm  uses  the 

binary  tree  generated  by  the  clustering  algorithm.  For  a one-dimensional  row, 
any  two  adjacent  clusters  can  shift  positions.  Therefore,  the  number  of  pos- 
sible linear  arrangement  is 


NP  = 2n_1 


(5-11) 


If  two  symmetrical  placements  are  considered  as  equal,  the  number  is  only  half 


NP  =2n~2 


(5-12) 
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This  number  is  much  smaller  than  the  total  possible  linear  placement 
arrangement  without  binary  tree  structure  which  is 

NP  = n!/2 

(5-13) 

Shuler  [50]  proposed  two  possible  binary-tree  shuffling  methods  to  perform 
the  linear  placement  of  the  clustering  result.  Although  he  proves  that  the 
top-down  linear  placement,  which  shuffles  from  the  highest  level  of  nodes  of  the 
binary  tree,  achieves  a better  result  (with  shorter  total  wiring  length),  it  is 
not  appropriate  for  gate  column  ordering.  This  method  neglects  all  the  transis- 
tor abutment  factors  and  causes  poor  result.  The  method  we  adopt  here  is  a 
revised  bottom-up  method  which  considers  the  factor  of  transistor  abutments 
during  binary  tree  shuffling.  The  solution  is  to  proceed  from  the  most  basic 
subtrees  (a  pair  of  nodes)  and  always  link  two  already  linearly  placed  subtrees 
into  a single  linear  placement.  Consider  the  two  subtrees  which  are  arranged 
linearly  with  m and  n nodes  respectively  in  each  subtree: 

(l,2,....,m)  (1,2 ,n) 

Since  the  two  linear  placements  cannot  be  intermixed  by  the  hypothesis,  there 
are  only  four  ways  the  two  linear  placements  can  be  linked: 


(1,2,. 

-,m) 

(1,2, 

n) 

(1,2, 

...,m) 

(n„.. 

.,2,1) 

(m,.. 

,2.1) 

(1,2, 

— ,n) 

(m,.. 

.,2,1) 

(n,.. 

,2,1) 
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The  linear  placement  process  then  starts  from  a leaf  node.  Each  time  the 
clustering  value  of  all  four  different  linear  placements  between  two  adjacent 
clusters  are  calculated.  The  placement  with  the  maximum  clustering  value  is 
selected  and  the  algorithm  proceeds  until  all  the  modules  have  been  added  to  the 
linear  placement.  Figure  5.9  shows  the  device  placement  result  for  the  "compare 
circuit"  shown  in  Figure  3.5. 

5.2  Device  Orientation  Assignment, 

In  section  5.1,  we  present  a technique  to  determine  the  horizontal  grid  loca- 
tions of  the  devices.  However,  the  horizontal  grid  locations  of  the  device  ter- 
minals are  still  not  fixed  because  the  drain  and  source  terminals  of  each  dev- 
ice are  free  to  switch  positions.  Devices  which  have  different  source/drain 
positions  are  termed  to  have  ’different  orientations’.  Let  us  call  that  a device  is 
east  oriented  if  the  drain  terminal  is  to  the  east  of  the  source  terminal  and  vice 
versa.  Because  a net  is  interconnected  through  device  terminals,  different 
arrangements  of  device  orientations  can  cause  much  different  wiring  results.  The 
task  of  Device  Orientation  Assignment  is  to  find  an  optimal  set  of  device 
orientations  such  that  the  total  wiring  length  of  the  cell  layout  is  minimized. 

The  computational  complexity  for  exhaustively  searching  an  optimal  set  of 
device  orientations  is  0( 2N)  where  N is  the  total  number  of  devices.  However, 
such  an  exhaustive  search  is  meaningless  because  the  vertical  wiring  length  is 
still  unknown  at  this  stage.  In  this  dissertation,  we  use  a net  processing  technique 
to  decide  the  orientations  of  some  of  the  devices.  The  orientations  of  the  other 
devices  are  then  determined  by  heuristic  algorithms  which  emphasize  reducing 
the  horizontal  wiring  length  and  vertical  wiring  length. 

Net  processing.  Each  net  in  the  circuit  is  defined  as  an  electrical  common 
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point  for  several  transistor  terminals.  Since  all  the  transistors  in  this  net  are  to 
be  interconnected  by  wire  segments,  it  is  advantageous  to  place  all  those 
transistors  at  the  same  device  row.  In  that  case,  the  interconnections  usually 
can  be  implemented  as  a single  horizontal  wire  segment.  However,  as  each 
device  contains  two  terminals  which  may  belong  to  two  different  nets,  the 
transformation  from  the  electrical  nets  into  net  segments  is  not  direct. 

Net  splitting.  In  case  a net  in  the  circuit  can  not  be  assigned  as  a single 
horizontal  line  segment,  it  should  be  split.  This  situation  happens  if  two 
transistors  in  the  net  are  located  at  the  same  column  or  three  transistors  in 
the  net  are  located  at  adjacent  columns.  Figure  5.10(a)  shows  an  example 
of  net  splitting. 

Net  merging.  This  procedure  will  increase  the  number  of  transistor  abut- 
ments. Two  nets  will  be  merged  as  a single  horizontal  line  segment  if  these  two 
nets  have  a transistor  in  common  and  they  are  located  at  different  sides  as 
shown  in  Figure  5.10(b). 

5.2.2  Heuristic  Algorithms 

The  device  orientation  assignment  process  utilizes  several  circuit 
heuristics  to  determine  the  device  orientations.  The  circuit  heuristics, 
sorted  in  priority  order,  are  described  as  follows: 

(1)  Maximizing  the  number  of  transistor  abutments. 

(2)  Maximizing  the  number  of  vertical  wiring  alignments. 

(3)  Minimizing  the  number  of  vertical  obstructions. 

(4)  Minimizing  the  total  horizontal  wiring  length. 

All  these  circuit  heuristics  are  described  in  the  following  paragraphs. 
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(b) 


Figure  5.10  Two  Tactics  of  Net  Processing,  a)  Net  Splitting; 
b)  Net  Merging. 
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Maximizing  the  number  of  transistor  abutments.  Transistor  abutment  in  cell 
layout  not  only  reduces  the  device  wiring  length  but  also  reduces  the  device 
area.  Consequently,  it  is  the  most  important  factor  to  be  considered. 
There  are  two  essential  conditions  for  two  transistors  A and  B to  be  abutted: 

(1)  A and  B must  be  placed  at  adjacent  columns  (or  rows). 

(2)  A and  B must  have  at  least  one  connection. 

(3)  A and  B must  be  properly  oriented  such  that  the  interconnected 
terminals  of  A and  B are  at  adjacent  locations. 

Figure  5.11(a)  shows  that  transistors  Tl,  T2,  and  T3  can  be  abutted  by 
proper  arrangement  of  those  three  transistors. 

Maximizing  the  number  of  vertical  wiring  alignments.  If  two  transistors  are 
located  at  adjacent  device  rows,  it  is  also  advantageous  to  arrange  the  device 
orientations  such  that  the  interconnected  terminals  locate  adjacently.  In  such 
case,  the  two  transistors  can  be  either  vertically  abutted  or  connected  by  a 
vertical  diffusion  run.  This  is  illustrated  in  Figure  5.11(b). 

Minimizing  the  number  of  vertical  obstructions.  If  a vertical  connection 
between  any  two  device  terminals  is  blocked  by  an  obstacle,  it  requires  extra 
wiring  detour  to  bypass  the  obstacle.  As  shown  in  Figure  5.11(c),  the  transistor 
should  be  properly  oriented,  if  possible,  to  avoid  vertical  obstruction. 

Minimizing  the  total  horizontal  routing  length.  Different  orientation  assign- 
ments among  two  horizontally  connected  devices  usually  cause  different  wiring 
lengths.  As  shown  in  Figure  5.11(d),  we  can  see  that  if  a device  is  right  con- 
nected, it  is  better  to  place  the  interconnected  terminal  at  right.  Otherwise,  it 
should  be  placed  at  left.  If  a device  has  both  right  and  left  connections,  we  con- 
sider it  as  right  (left)  connected  if  the  number  of  right  (left)  connections  is  greater 
than  the  left  (right)  connections. 
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Figure  5.11  Heuristic  of  Device  Orientation  Assignment,  a) 

By  Transistor  Abutment;  b)  By  Vertical  Alignment; 
c)  By  Vertical  Obstruction;  d)  By  Horizontal 
Wiring  Length. 
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1L3  Device  Row  Roder  Assignment, 

After  all  the  device  orientations  are  determined,  the  last  step  for  device 
placement  is  to  determine  the  vertical  grid  locations  (row  orders)  of  the  devices. 
The  permutation  number  for  the  exhaustive  search  of  optimal  device  row  orders 
is  also  very  huge.  We  assume  that  there  are  n columns  for  a cell  layout.  Each 
column  has  m device  rows.  Let  Ndi  denote  the  number  of  devices  in  ith  device 
column.  Since  the  devices  can  be  permuted  only  within  each  device  column, 
the  total  permutation  for  the  device  row  orders  is 

PN=  £c(m,Ndi)Ndi\ 

»-i 

(5-14) 

Fortunately,  the  computational  complexity  has  been  reduced  tremendously 
by  the  net  processing  technique  described  in  the  previous  section.  After  the  net 
processing,  the  devices  are  converted  into  a set  of  horizontal  wire  segments 
with  possible  vertical  connections.  The  device  placement  problem  is,  therefore, 
converted  into  the  wire  segment  placement  problem.  Since  the  number  of  horizon- 
tal wire  segments  derived  from  the  net  processing  procedures  is  far  less  than  the 
total  number  of  devices,  the  computational  complexity  is  considerably  reduced. 

Placement  of  a set  of  horizontal  wire  segments  is  a classical  channel  rout- 
ing problem.  The  algorithm  proposed  by  Hashimoto  and  Stevens  [51]  was  proved 
to  be  optimal.  However,  this  algorithm  did  not  consider  the  vertical  connec- 
tions associated  with  the  wire  segments.  In  this  dissertation,  we  used  a con- 
straint graph-based  approach  to  handle  the  wire  segment  placement  problem 
with  vertical  connections. 
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The  (vertical)  constraint  graph  is  defined  as  a directed  graph  which  impli- 
citly determines  the  placement  order  of  the  horizontal  wire  segments.  The  con- 
straint graph  CG  consists  of  a set  of  vertices  V = { v^,v^  a set  of  edges  E = 
{ C>e2  ,.••},  and  a mapping  ^ that  maps  every  edge  onto  some  ordered  pair  of 
vertices  (vj,Vj).  The  vertex  vx  , which  the  edge  ek  is  incident  out  of,  is  called  the 
initial  vertex.  The  vertex  Vj  , which  the  edge  ek  is  incident  into,  is  called  the  ter- 
minal vertex  of  . A directed  edge  (vx  ,vj ) is  represented  by  a line  segment 
between  vi  and  Vj  with  an  arrow  directed  from  v{  to  Vj  . Each  horizontal  wire 
segment  to  be  placed  is  represented  as  a vertex  in  the  digraph.  A directed  edge  (vt 
, Vj)  indicates  that  horizontal  wire  segment  v{  should  be  placed  above  Vj  . 

Constraint  graph  and  binary  relations.  Due  to  the  wiring  considerations,  a 
binary  relation  R between  vertex  pairs  (vx  , Vj)  always  exists.  We  write 

Vj  R Vj 

and  say  that  v{  has  relation  to  Vj  . There  are  three  kinds  of  binary  relations 
between  the  vertex  pairs.  Figure  5.12  illustrates  all  the  binary  relations.  In 
Figure  5.12(a),  both  of  nets  A and  B have  a vertical,  upward  connection.  How- 
ever, the  vertical  connection  of  B has  the  same  location  of  a device  terminal 
in  A.  In  this  case,  B is  vertically  obstructed  by  A.  It  is  obvious  that  whenever 
B is  vertically  obstructed  by  A,  B should  be  placed  above  A.  Otherwise,  the 
vertical  connection  of  B will  need  extra  wiring  detour  to  bypass  the  blocking 
obstacle.  Consequently,  we  define  ABOVE  and  BELOW  relations  between  a 
pair  of  vertices  ( vx,vj  ) with  vx  ABOVE  Vj  indicates  that  vt  should  be  placed 
above  Vj  and  vice  versa.  Another  relation  vt-  UNREL  Vj  indicates  that  Vj  and  Vj 
are  not  blocking  each  other  and  have  placement  constraint  between  them.  Figure 
5.12(b)  shows  that  nets  C and  D can  be  placed  either  way  because  they  have  no 
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(a) 


(b) 


Figure  5.12  Binary  Relations  Between  Two  Nets,  a)  Net  B is 

ABOVE  Net  B;  b)  Net  C and  Net  D are  not  Related. 
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wiring  constraints. 

Construction  of  the  vertical  constraint  graph.  Since  each  net  is  composed 
of  a set  of  horizontally  interconnected  devices,  the  binary  relation  R between  any 
two  nets  A and  B is  determined  by  the  binary  relations  among  the  devices  in  A 
and  B.  Let  R’  denote  the  binary  relation  between  two  devices  tt  and  t]  . The 
definition  of  binary  relations  ABOVE  , BELOW  and  UNREL  between  two  nets  is 
also  applied  to  the  device  relations.  Let  tj  ABOVE  tj  indicate  that  device  ti  must 
be  placed  above  tj  because  of  vertical  obstruction.  Suppose  that  net  to,-  contains  a 
set  of  devices  to,-  = { tivti2,  . . . , tik  } and  that  net  rij  contains  a set  of  devices 
nj  — { ■ ■ ■ , tji  }.  Let  us  define  a special  binary  relation  ALIGN  between 

two  devices.  In  which  case,  tt  ALIGN  tj  indicates  that  device  tt  must  be  placed 
at  the  same  row  as  tj  . Since  all  the  devices  in  the  same  net  will  be  placed  at  the 
same  row,  the  following  binary  relations  hold  for  net  to,-  and  net  n]  : 


tn  ALIGN  t;  2 
tj  2 ALIGN  tj 3 


tik- 1 ALIGN  tik 
tj  | ALIGN  tjo 
tj2  ALIGN  tj3 


tji — i ALIGN  tji 


The  ALIGN  relation  has  the  property  that  if  for  any  three  transistors  tt,  tj. 
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t 


k > 


/;  ALIGN  tj  and  £,•  /?  tk 


always  implies 


tj  R t 


k 


where  R can  be  either  one  of  the  four  binary  relations  described  above.  All 
four  binary  relations  also  have  the  transitivity  property  such  that, 


ti  R tj  and  tj  R tk 


always  imply 


£,  R tk 

Now  let  us  describe  how  to  transform  devices’  binary  relations  into  nets’ 
binary  relations.  Let  tiken,j  and  tjt e rij  . Then  the  binary  relations  ABOVE  and 
BELOW  have  the  following  properties: 

IF  tik  ABOVE  tjl  THEN  nt  ABOVE  ny 

IF  tik  BELOW  tjt  THEN  nt  BELOW  ny 


By  using  these  two  properties  and  the  device  relation  properties  described 
previously,  the  constraint  graph  of  the  nets  can  be  transformed  from  the 
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Figure  5.13  Netlist  Representation  of  a Circuit. 
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Figure  5.14  Vertical  Constraint 

Graph  of  the  Devices. 


Figure  5.15  The  ALIGN  Relation. 


Figure  5.16  Vertical  Constraint 
Graph  of  the  Devices 
after  the  Binary  Operations. 


Figure  5.17  Vertical  Constraint 
Graph  of  the  Nets . 
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constraint  graph  of  the  devices.  Figure  5.13  shows  an  example  of  net- 
list  representation  which  is  generated  by  the  net  processing  process.  The 
vertical  constraint  graph  of  the  devices,  which  is  shown  in  Figure  5.14,  is  con- 
structed by  the  vertical  connections  and  relative  positions  among  the 

device  terminals.  Figure  5.15  depicts  the  ALIGN  relations  among  the  net  seg- 
ments. Figure  5.16  shows  the  constraint  graph  of  the  devices  after  applying 
transitivity  properties  and  ALIGN  relation  to  the  devices.  An  equivalent  con- 
straint graph  for  the  nets  is  then  constructed  accordingly  as  shown  in  Figure 
5.17. 

Constrained  left-edge-first,  algorithm.  After  the  vertical  constraint  graph  for 
the  nets  is  constructed,  an  algorithm  called  the  constrained  left-edge-  first  is 
then  applied  to  place  the  horizontal  wire  segments.  The  algorithm  is  a 
modification  of  Hashimoto  and  Steven  [51]  and  Ivuh  et  al.  [52]  approaches. 
This  algorithm  uses  a merging  technique  to  merge  two  or  more  nets  into  a sin- 
gle net  while  observing  the  layout  constraint.  The  wire  segments  in  the  final  con- 
straint graph  after  the  merging  process  can  be  easily  placed  one  by  one  into  the 
device  rows.  The  procedures  of  the  device  row  order  assignment  can  be  summar- 
ized as: 

(1)  Construct  the  constraint  graph  of  the  devices. 

(2)  Construct  the  constraint  graph  of  the  nets  based  upon  result  of  (1). 

(3)  Sort  the  edges  in  net  segment  set  W based  upon  the  value  of  the  left 

coordinate  of  each  net  segment.  If  several  net  segments  have  the  same 
left  edge,  the  net  segment  with  the  higher  priority  in  the  constraint 
graph  will  be  placed  on  top. 

(4)  Select  the  first  net  segment  Wl  in  W and  delete  this  net  segment  from 

W. 

(5)  Find  all  the  net  segments  which  have  left  edge  coordinates  smaller  than 
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that  of  the  net  segment  W1  and  store  all  those  net  segments  into  a list 

L. 

(6)  Select  a net  segment  Wl  from  L which  (a)  is  closest  to  the  right  of  Wl 

and  (b)  minimizes  the  increase  of  the  longest  path  of  the  vertical  con- 
straint graph. 

(7)  Merge  net  segments  W1  and  . Delete  the  net  segment  from  W . 

(8)  Repeat  steps  of  (6)  and  (7)  until  no  nets  in  L can  be  merged  with  net 

WV 

(9)  Go  to  step  4 and  repeat  steps  5 to  8 until  W is  empty. 

(10)  Place  the  net  segments  one  by  one  according  to  the  priority  order 
in  the  final  merged  constraint  graph. 

The  device  row  order  assignment  procedures  are  illustrated  by  an 
example  shown  in  Figure  5.18.  Figure  5.  19  shows  the  intermediate  steps  of  the 
graph  merging  process.  The  final  implementation  of  the  net  segments,  accord- 
ing to  the  merged  constraint  graph,  is  shown  in  Figure  5.20 
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Figure  5.20  Placement  of  the  Net  Segments  According  to  the 
Vertical  Constraint  Graph  in  Figure  5.19. 


CRAP  TER  VI 

ROUTING  IN  HIGH  DENSITY  GATE  MATRIX  STRUCTURE 


After  the  device  placement  procedures,  the  circuit  representation  can  be 
transformed  from  circuit  domain  into  topological  domain  with  device  terminals 
represented  as  grid  points  and  gate  and  power  signals  represented  as  grid  lines. 
The  signal  nets,  which  are  originally  defined  as  sets  of  electrically  common 
device  terminals,  are  now  defined  as  sets  of  grid  points  and  lines  which  should  be 
connected  together.  In  this  chapter,  we  present  algorithms  which  realize  the 
circuit  connectivity  among  the  grid  points  and  lines  with  an  emphasis  to 
minimize  the  total  wiring  length  under  all  the  routing  constraints. 

Six  sections  are  included  in  this  chapter.  Section  I describes  the  general 
routing  constraints  under  the  Gate  Matrix  Layout  and  CMOS  IC  design 
environment.  Section  II  depicts  the  terminal  labeling  techniques  which  gen- 
erate important  data  structures  for  the  use  of  wire  segment  generation  and 
detailed  routing  processes.  A method  for  the  generation  of  wire  segments  is 
given  in  Section  III.  The  method  converts  a muti-point  net  into  a set  of  two- 
point  wire  segments  (wire-list)  which  has  the  minimum  total  wiring  length. 
The  minimum  spanning  tree  algorithm  is  used  to  find  the  shortest  path  which 
interconnects  the  multi-point  net.  Section  IV  presents  the  routing  techniques  to 
find  the  paths  for  the  generated  wire  segments.  The  system  combines  several 
channel  routing  techniques  to  execute  detailed  routing  for  different  routing 
schemes.  Finally,  the  last  section  describes  the  techniques  for  the  improvement 
of  cell  layout  and  for  small-scale  modification  of  cell  configuration. 
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6.1  The  Routing  Constraints  and  Assumptions 

The  routing  constraints  can  be  considered  as  a prescription  for 
preparing  the  photomasks  that  are  used  in  the  fabrication  of  integrated  cir- 
cuits. The  constraints  provide  a necessary  communication  link  between  the  cir- 
cuit design  and  fabrication  process.  All  the  constraints  are  derived  from  the 
observation  of  the  interactions  between  different  mask  layers. 

Processing  technology  ..and  layer  representation.  The  advances  in  CMOS  pro- 
cessing technology  has  resulted  in  the  complex  CMOS  processes  which  somewhat 
inhibit  the  visualization  of  the  mask  levels  in  the  actual  fabrication  process. 
Nevertheless  the  design  process  can  be  abstracted  to  a manageable  number  of  con- 
ceptual layout  levels  that  represent  the  key  features  at  the  symbolic  level.  In  this 
dissertation,  a single-metal,  twin-well,  poly-gate  CMOS  technology  is  assumed. 
Only  five  most  important  mask  layers,  which  includes  n diffusion,  p diffusion, 
polysilicon,  metal,  and  contact,  of  the  CMOS  fabrication  process  are  con- 
sidered at  the  symbolic  level.  Each  mask  layer  is  represented  as  a stipple 
pattern  in  schematic  layout  or  a special  symbol  in  computing  process.  Figure  6.1 
shows  several  frequently  used  stipple  patterns. 

Due  to  the  interactions  between  different  mask  layers,  there  are  several 
constraints  for  the  interconnection  path  within  the  cell.  Those  constraints  are 
called  the  routing  constraints.  Following  is  the  list  of  the  routing  constraints  for 
symbolic  cell  layout. 

(1)  The  metal  layer  and  polysilicon  layer  are  isolated  according  to  natural 

material  property.  Contacts  are,  therefore,  needed  for  the  the 
interconnection  between  those  two  layers.  The  same  situation 
applies  for  the  layers  of  metal  and  diffusion. 

(2)  Crossover  is  allowed  between  (a)  metal  and  diffusion  layers  (b)  metal 
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Figure  6.1  Major  Stipple  Patterns  Used  in  Schematic  Layouts. 
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and  polysilicon  layers. 

(3)  Crossover  is  not  allowed  between  (a)  two  metal  layers  (b)  metal  and 
contact  (c)  diffusion  and  contact  (d)  two  diffusion  layers  (e)  diffusion 
and  polysilicon  layers. 

The  routing  constraints  described  above  are  illustrated  by  Figure  6.2. 

Basic  assumptions  for  Gate  Matrix  routing.  Based  upon  the  described  rout- 
ing constraints,  several  assumptions  have  been  imposed  on  the  routing  process 
with  an  emphasis  to  reduce  the  potential  routing  conflicts  and  to  simplify  the 
algorithm  design. 

(1)  Polysilicon  is  used  to  realize  gate  signals.  It  runs  vertically  and  is 
separated  by  diffusion  runs. 

(2)  Wire  connections  can  be  realized  as  either  metal  or  diffusion  layer. 
Metal  is  used  for  horizontal  wire  connection.  Diffusion  is  used  for 
horizontal  wire  connection.  The  only  exception  is  that  wire  connec- 
tion at  the  left-most  side  and  right-most  column  can  be  realized 
as  metal  lines. 

(3)  The  power  net,  which  is  realized  as  a horizontal  line,  is  placed  at  the 

top  of  the  cell.  The  ground  net  is  placed  at  the  bottom  of  the  cell. 
Although  both  nets  are  represented  as  a set  of  separated  grid  points, 
all  those  points  are  considered  as  interconnected  implicitly. 
Device  terminals  which  are  connected  to  the  power  (ground)  net  can 
be  connected  to  any  grid  point  of  the  net. 

(4)  A polysilicon  gate  column  is  also  considered  as  a set  of  implicitly 

interconnected  grid  points.  Each  device  terminal  which  is  connected 
to  the  gate  column  can  be  connected  to  any  one  of  the  grid  points. 

(5)  Horizontal  (metal)  wires  can  be  placed  on  either  device  rows  or  horizon- 

tal routing  channels. 


101 


Legal  Connection 


Legal  Crossover 


Figure  6.2  Two  Basic  Layout  Rules. 
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(6)  Vertical  (diffusion)  wires  which  need  to  run  over  any  device  row 
require  a through  terminal  on  that  device  row. 

(7)  We  assume  that  unlimited  horizontal  routing  tracks  can  be  inserted 

between  two  adjacent  device  rows  or  between  a device  row  and  the 
power  (ground)  net.  The  insertion  of  the  vertical  routing  tracks,  how- 
ever, is  strictly  limited.  It  is  permitted  only  when  the  system,  after 
boundary  analysis,  suggests  to  do  so  or  channels  are  congested. 

6.2  Terminal  Labeling 

The  task  of  terminal  labeling  is  to  convert  the  information  of  circuit 
connectivity  from  the  circuit  domain  to  topological  domain.  Such  information  is 
needed  for  the  use  of  wire  segment  generation  and  detailed  routing.  Let  S == 
{Sj  | S{  is  a net}  be  a set  of  nets  which  describes  the  circuit  connectivity.  Each  net 
Si  consists  of  a set  of  device  terminals  5,  = {pn,pi2,....,pik}.  After  the  device 
placement  procedures,  the  grid  location  of  each  device  terminal  is  determined. 
Consequently,  we  can  describe  the  net  connectivity  as  a different  format. 
Let  S’  = {S'  ,■  [S'  i is  a net}  be  a set  of  nets  which  describes  the  circuit  connec- 
tivity. Then  we  can  describe  each  net  S';  as  a set  of  grid  points  S'  - = 
{ P ' ti’P7  tk  } where  p'  ik  = (xik  ,yik ),  is  the  grid  location  assigned  for  device 

terminal  pik . We  now  describe  the  terminal  labeling  process. 

Let  S = {bjjSo, — Ss  } be  the  set  of  nets.  Each  net  Sk  is  given  a net  number 
k over  integers  0,l,2,...,s  where  s is  number  of  nets.  We  then  assign  the  same  net 
number  k to  all  the  device  terminals  pik  which  is  contained  in  the  net  Sk.  Let  N 
be  the  number  of  grid  rows  and  M be  the  number  of  grid  columns  on  the  Gate 
Matrix  structure.  Consequently,  there  are  MN  grid  location  available  for  the  use 
of  device  terminals.  We  call  each  grid  location  as  a "terminal  site".  There  are 
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two  kinds  of  terminal  sites.  If  a terminal  site  is  occupied  by  a device  terminal,  it 
is  called  a "used  terminal  site"  and  is  assigned  the  net  number  of  that  device 
terminal.  Otherwise,  the  terminal  site  is  called  "unused  terminal  site"  and  is 
assigned  a number  0.  We  call  the  terminal  table  as  the  topological  net  represen- 
tation. Figure  6.3  shows  the  topological  net  representation  of  a sample  cir- 
cuit. Each  topological  net  representation  includes  two  kinds  of  key  informa- 
tion 

(1)  All  the  terminal  sites  with  the  same  net  number  represent  a 
multi-point  net  which  should  be  interconnected  by  wire  segments. 

(2)  The  distance  between  two  net  terminals  can  be  measured  as  the 

distance  between  the  two  grid  points  on  which  the  net  terminals 
locate. 

(3)  The  terminal  sites  labelled  as  "0"  represent  unused  terminal  sites.  Only 

the  unused  terminal  sites  can  be  assigned  as  through  terminals 
which  allow  the  vertical  diffusion  path  to  pass  through. 

After  the  terminal  labeling  process,  the  circuit  connectivity  is  success- 
fully transformed  into  the  topological  domain.  Each  net  & = {p'  ivp'  -0 p'  -A 

is  represented  as  a set  of  grid  points  instead  of  a set  of  device  terminals. 
The  task  of  the  routing  process  is  now  to  interconnect  all  the  grid  points 
within  a multi-point  net.  Figure  6.4  shows  the  final  routing  result  of  Figure  6.3. 

Q-bstruction  map.  Besides  the  topological  net  representation,  we  also  need  an 
obstacle  map  to  indicate  which  mask  layers  are  located  on  each  terminal  site.  This 
information  is  essential  because  the  device  rows,  which  contain  a lot  of  routing 
obstacles,  are  also  used  as  part  of  the  routing  channels.  The  obstacle  map  will 
indicate  which  parts  of  the  device  rows  can  or  cannot  be  used  as  routing  chan- 
nels. Initially,  the  obstacle  map  contains  only  the  information  of  the  device 
terminals.  As  routing  process  proceeds,  the  information  of  the  wire 
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Figure  6.3  Terminal  Labelling  of  a Sample  Circuit. 


Figure  6.4  Final  Routing  Result  of  Figure  6.3. 
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segments  and  contacts  will  be  gradually  added  to  the  map.  The  obstacle  map 
indicates  what  kind  of  layers  are  located  on  each  terminal  site  and  implicitly 
indicates  which  kinds  of  wire  segments  or  contacts  are  not  allowed  to  passed 
through  or  placed  on  that  terminal  site.  To  implement  the  routing  algorithms, 
we  provide  a M-flag  to  represent  the  information  of  mask  layers  on  each  terminal 
site: 

M-flag: 


M(x,y)  = 0 indicates  an  unused  terminal  site  at  (x,y) 

M(x,y)  = 1 indicates  polysilicon  layer  at  (x,y) 

M(x,y)  = 2 indicates  a device  terminal  at  (x,y) 

M(x,y)  = 3 indicates  both  polysilicon  and  diffusion  at  (x,y) 

M(x,y)  = 4 indicates  metal  layer  at  (x,y) 

M(x,y)  = 5 indicates  both  metal  and  polysilicon  at  (x,y) 

M(x,y)  = 6 indicates  both  metal  and  diffusion  at  (x,y) 

M(x,y)  = 7 indicates  metal,  polysilicon,  and  diffusion  at  (x,y) 

M(x,y)  = 8 indicates  a contact  at  (x,y) 

M(x,y)  = 9 indicates  diffusion  run  at  (x,y) 

6.3  Wire  Segment  Generation 

Most  routing  algorithms  deal  with  the  problem  of  connecting  two  enti- 
ties, such  as  two  points  or  a point  and  a wire.  Hence,  given  an  n point  net,  S'  , 
i\iP  in  }>  the  net  is  usually  divided  into  n-1  wire  segments,  each  of 

which  connects  two  points  {p'  ij,v'  ik}-  Three  major  wire  segment  generation 
approaches  [53-64]  have  been  proposed  since  the  late  1950s.  The  first  is  to  chain 
the  n points,  i.e.,  to  find  a permutation  p of  the  signal  points  1.2, ...,n  so  that 
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n — 1 

Yj  ^p(0.p('  + 1) 

1=1 

is  minimum,  where  p(i)  is  the  ith  point  in  the  permutation  and  ( * ) p ( * + 1)  's  the 

distance  between  point  p(i)  and  p(i+l).  Unfortunately,  this  problem  is 
equivalent  to  the  traveling  salesman  problem  [53-58]  and  is  NP  complete.  The 
second  most  common  technique  used  for  wire  segment  generation  is  to  construct  a 
minimal  spanning  tree  [59-62],  which  can  be  done  in  time  no  worse  than  O(n-) 
[62].  The  third  technique  is  to  form  a Steiner  tree  [63-64],  which  is  also  proved 
to  NP  complete.  All  three  tree  structures  for  wire  connection  are  shown  in  Fig- 
ure 6.5. 

Among  the  three  tree  structures,  the  Steiner  tree  usually  results  in  the 
shortest  total  wiring  length  while  minimum  spanning  tree  is  ranked  the  second 
and  chain  connection  the  third.  Gilbert  and  Poliak  [64]  has  conjectured  the 
relationships  among  the  optimal  results  of  the  three  tree  structures  which  can 
be  specified  as  the  following  inequality: 


L $ Ly  <C  <C  2LS 

where  Ls  denotes  the  length  of  the  minimal  Steiner  tree,  Lu  denotes  the  length  of 
the  minimal  spanning  tree,  and  LT  denotes  the  minimal  length  of  the  traveling 
salesman  tour.  The  average  ratio  LS/LM,  according  to  Gilbert  and  Polak's  [64] 
estimation,  is  about  0.866. 

As  the  Steiner  tree  can  result  in  the  shortest  total  wiring  length,  it  is  sup- 
posed to  be  an  ideal  method  for  wire-list  generation.  However,  since  there 
are  no  known  existing  algorithms  which  is  able  to  find  the  Steiner  tree  with 
more  than  five  vertices,  we  choose  minimum  spanning  tree  as  the  tree  structure 
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Figure  6.5  Three  Different  Structures  for  Wire  Connections. 

a)  Chain  Structure;  b)  Minimum  Spanning  Tree;  c) 
Steiner  Tree. 
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for  wire  segment  generation.  Special  routing  techniques  are  then  used  in  final 
routing  process  to  improve  the  result. 

6.3.1  The  Minimum  Spanning  Tree  Algorithm 

A tree  T is  said  to  be  a spanning  tree  of  a connected  graph  if  T is  a sub- 
graph of  G and  T contains  all  vertices  of  G.  Since  the  wire  segment  generation 
problem  is  to  find  a shortest  path  which  interconnects  all  the  grid  points,  it  can 
be  conceptually  modeled  as  a minimum  (shortest)  spanning  tree  problem.  Given  a 
n point  net  S'  ,■  = {p'  ix,p'  i2,....,p'  ,„}  and  a distance  matrix  D = [djk],  where  djk 
is  the  distance  between  point  p'  and  p'  ik , we  represent  the  n point  net  as  a 
weighted  graph  G which  has  n vertices  and  C(n,2)  edges.  Each  vertex  Vj 
represents  the  grid  point  p'  and  each  edge  ejk  — ( Vj,vk  ) represents  a path 
between  vertex  Vj  and  vk.  The  weight  of  an  edge  e]k  is  equal  to  the  distance  djk. 

There  are  several  methods  [59-62]  available  for  actually  finding  the 
minimum  spanning  tree.  One  algorithm  due  to  Kruskal  [59]  has  been  chosen 
and  is  described  as  follows: 

(1)  Sort  all  edges  of  the  graph  G in  order  of  nondecreasing 
weight  and  store  them  in  a list  L. 

(2)  Select  the  smallest  edge  from  L.  Store  it  into  another  list  L’  and 
remove  it  from  L. 

(3)  Select  another  smallest  edge  from  L that  makes  no  circuit  with 
the  previously  selected  edge.  Store  it  into  L’  and  remove  it  from  L. 

(4)  Continue  step  3 until  n-1  edges  have  been  selected. 

(5)  All  the  edges  in  L'  constitute  the  minimum  spanning  tree. 

Figure  6.6  uses  an  example  to  illustrate  the  minimum  spanning  tree  gen- 
eration process.  Figure  6.6(b)  shows  the  weighted  graph  converted  from  the 
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Figure  6.6  Minimum  Spanning  Tree  Generation  Process,  a)  A Multi-Point 

Net;  b)  Its  Corresponding  Weighted  Graph;  c)  Minimum  Spanning 
Tree  on  Grids;  d)  Minimum  Spanning  Tree  on  the  Graph. 
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Figure  6.7  The  Rectilinear  (Manhattan)  Distance  between  Point  A 
and  Point  B. 
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multi-point  net  depicted  in  Figure  6.6(a).  The  minimum  spanning  tree  of  the 
weighted  graph  is  generated  by  computer  and  is  shown  by  bold  lines  on  Figure 
6.6(d).  The  corresponding  minimum  spanning  tree  on  the  grid  structure 
is  shown  as  interconnected  wire  segments  in  Figure  6.6(c). 

6.3.2  Distance  Estimation  in  Gate  Matrix  Structure 

Since  the  generation  of  minimum  spanning  tree  is  based  upon  the  com- 
parison of  the  distance  functions  among  the  grid  points,  the  accuracy  of  the  dis- 
tance measure  is  very  critical  to  the  success  of  a minimum  spanning  tree  algo- 
rithm. For  the  past  two  decades,  rectilinear  distance  (or  called  Mahattan  dis- 
tance) has  been  the  most  popular  distance  measure  function  used  in  layout  algo- 
rithms. As  shown  in  Figure  6.7,  the  The  Manhattan  distance  between  two 
points  (x^t/j)  ( x2,V2 ) is  simply 


DST(A  ,B) 


X1 


x2 


+ 


(6-1) 


However,  in  the  Gate  Matrix  layout  environment,  the  distance  function 
between  two  objects  (can  be  either  points  or  lines)  varies  in  different  situations. 

(1)  Distance  between  two  gate  signals.  Because  each  split  gate  signal  is 
represented  as  a vertical  polysilicon  grid  line  in  the  Gate  Matrix,  net 
connection  between  two  gate  signals  will  be  topologically 
represented  as  a horizontal  wire  segment  between  two  vertical 
lines.  Let  the  horizontal  coordinate  of  the  line  A be  and  horizon- 
tal coordinate  of  line  B be  x2 . Then  the  distance  measure  between 
these  two  lines  can  be  represented  as 


Ill 


DST(A  ,B) 


(6-2) 


(2)  Distance  between  a device  terminal  and  a gate  signal.  The  net  connec- 
tion between  a device  terminal  and  a gate  signal  is  topologically 
represented  as  a horizontal  wire  segment  between  a point  and  a 
vertical  line.  Let  the  coordinate  of  point  A be  (x1,yl)  and  the  horizon- 
tal coordinate  of  line  B be  ar2.  Then  the  distance  between  these  two 
objects  is 


DST(A,B ) 


(6-3) 


(3)  Distance  between  two  device  terminals.  Net  connection  between  two 
device  terminals  A and  B is  topologically  represented  as  a two-point 
wire  segment  W(A,B).  If  both  of  the  two  points  locate  at  the  same 
device  row  or  at  adjacent  device  rows,  the  distance  functions  are 
the  same  as  equation  6-2  and  6-1.  However,  if  these  two  points 
do  not  locate  at  adjacent  rows,  routing  detour  may  occur  if  the 
insertion  of  extra  vertical  routing  tracks  is  not  allowed.  Figure  6.8 
shows  two  extreme  cases  of  routing  detours.  Since  only  the  unused 
terminal  sites  can  be  used  as  the  through  terminals,  the  net  struc- 
ture in  Figure  6.8(a)  requires  a routing  detour  for  a total  length  of  3L 
where  L is  the  width  of  the  device  row.  However,  if  through  termi- 


nals are  all  aligned  at  the  same  vertical  location  as  shown  in  Figure 
6.8(b),  no  routing  detour  occurs.  By  weighting  the  factor  of  rout- 
ing detour  into  the  distance  function,  the  distance  function 
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(a) 


(b) 


Figure  6.8  The  Routing  Distance  between  Two  Device  Terminals, 
a)  Worst  Case;  b)  Best  Case. 
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between  two  device  terminals  should  be  a weighted  equation  which 
gives  the  vertical  part  of  the  distance  more  weight.  The  weighted 
Manhattan  distance  between  two  device  terminals  in  the  Gate  Matrix 
grid  structure  is  given  as 


DST{A  ,B ) 


+ 


V\  ~ V2 


+ 


(6-4) 


where  L is  the  length  of  device  row.  The  number  of  a is  1 if  \y  j — > 2 and  is  0 

otherwise. 


1L4  Detailed  Routing  and  Layout,  Improvement, 

Detailed  routing  is  the  task  to  find  paths  for  the  generated  wire  seg- 
ments. Numerous  routing  techniques  [65-79]  have  been  proposed  in  the  past 
three  decades.  In  general,  those  routing  algorithms  fall  into  three  categories, 
namely  maze  routers,  line  routers,  and  channel  routers.  In  this  dissertation, 
we  use  the  channel  routing  technique  [65-73]  for  several  reasons: 

(1)  Channel  routing  is  a simultaneous  routing  technique  which  usually 

results  in  shorter  total  wiring  length. 

(2)  Gate  Matrix  structure  allows  the  insertion  of  horizontal  routing 

tracks  which  can  be  used  as  routing  channels. 

(3)  Channel  routing  is  a systematic  routing  method.  Channel  densities  in 

both  horizontal  and  vertical  directions  can  be  predicted  before  detailed 
routing  process.  Partial  rerouting  is  also  possible  during  the  layout 
improvement  process.  Consequently,  it  is  an  ideal  routing  technique  for 
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generating  cells  with  different  configurations. 

6.4.1  Channel  Assignment, 

The  first  step  for  a channel  router  is  to  assign  each  wire  segment  to  the 
routing  channels.  If  two  ends  of  a wire  segment  are  located  at  adjacent  device 
rows,  we  assume  that  the  wire  segment  will  be  realized  in  the  routing  channel 
which  lies  in  between  the  two  device  rows.  However,  if  the  two  ends  are  not 
located  adjacently,  the  wire  segment  needs  to  be  placed  in  several  channels. 
The  channel  assignment  process  will  be  different  in  two  different  cases: 

(1)  The  insertion  of  vertical  routing  tracks  is  allowed.  When  the  insertion 

of  vertical  routing  tracks  is  allowed,  no  routing  detour  is  neces- 
sary. For  the  net  structure  in  Figure  6.9,  a vertical  routing  track  can 
be  inserted  just  beside  the  column  of  the  starting  net  terminal.  The 
connection  between  terminal  A and  B can  always  be  realized  by  three 
zigzag  line  segments  as  shown  in  Figure  6.10. 

(2)  The  insertion  of  vertical  routing  tracks  is  not  allowed.  In  this 

case,  we  have  to  find  a through  terminal  at  each  device  row  the 
wire  segment  passes  through.  The  wire  segment  is  then  realized  as  a 
set  of  zigzag  line  segments  as  shown  in  Figure  6.11(a).  Each  fine  seg- 
ment is  assigned  with  the  closest  horizontal  routing  channel.  In 
determining  the  through  terminals  on  each  device  row,  we  use  the 
shortest  path  algorithm  [70]  to  find  the  set  of  line  segments  with  the 
minimum  total  length.  Figure  6.11(b)  is  an  example  of  finding  the 
through  terminals  for  the  non-adjacent  wire  segment. 
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Figure  6.9  A Sample  Net  Structure. 
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Figure  6.10  Channel  Assignment  by  Inserting  a Vertical  Routing 
Track. 
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Figure  6.11  Channel  Assignment  by  Shortest  Path  Algorithm,  a)  Decompose 
the  Two-Point  Wire  into  Several  Short  Wires;  b)  Select 
the  Through  Terminals  Which  Are  Located  on  the  Shortest  Path. 
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Figure  6.12  Single-Row  Routing  Patterns,  a)  Device  Rows  Are  Used 
as  Routing  Channels;  b)  Use  the  Routing  Channels 
between  Two  Device  Rows. 
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6.4.2  Channel  Routing 

After  all  the  wire  segments  are  assigned  to  channels,  the  channel  router  exe- 
cutes routing  on  each  channel.  The  channel  routing  technique  implemented  is 
similar  to  the  one  proposed  in  [66].  However,  since  the  Gate  Matrix  layout  struc- 
ture has  more  than  one  kind  of  routing  scheme,  a combined  technique  is 
used  to  execute  detailed  routing. 

Single-row  routing.  Gate  Matrix  layout  has  a large  amount  of  internal  nets 
of  which  the  device  terminals  are  located  at  the  same  device  row.  Those  nets 
have  the  single-row  routing  pattern  as  shown  in  Figure  6.12.  The  algorithm  used 
to  solve  the  single-row  routing  problem  is  similar  to  the  one  proposed  by  Tsuki- 
yama  et  al.  [66].  All  the  wire  segments  with  both  ends  located  on  the  same 
device  row  are  ordered  by  their  left  edges  as  shown  in  Figure  6.13.  Those  wires 
are  then  placed  on  the  device  row  and  the  channel  located  below  the  device  row. 
The  routing  result  is  shown  in  Figure  6.14. 

Two-row  routing.  All  the  wire  segments,  whose  both  ends  are  located  at 
adjacent  device  rows,  have  the  two-row  routing  pattern  as  shown  in  Figure 
6.15.  The  two-row  routing  problem  is  a classical  channel  routing  problem 
and  is  solved  by  the  constraint  graph  based  [65]  algorithm.  As  shown  in  Fig- 
ure 6.  16,  all  the  wire  segments  which  have  vertical  routing  constraints  are 
solved  by  trunk  partition  (or  called  dogleging)  [69]  technique.  The  number 
of  possible  locations  for  trunk  partition  is  restricted  by  the  number  of  terminal 
sites  on  the  device  rows.  However,  insertion  of  vertical  routing  tracks  can  be 
used  as  the  last  resort  to  create  more  trunk  partition  sites. 
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Figure  6.13  A Single-Row  Net  Structure. 


Figure  6.14  Routing  by  the  Left-edge-first  Algorithm. 
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(a)  (b) 


Figure  6.15  Two-Row  Routing  Patterns,  a)  Using  Device  Rows; 
b)  Not  Using  Device  Rows. 


(b) 


Figure  6.16  Solving  Vertical  Routing  Conflicts  by  Trunk  Partitions. 

a)  Net  1 and  Net  2 Cannot  Be  Realized  at  the  Same  Time; 

b)  A Trunk  Partition  is  Needed  to  Complete  the  Routing. 
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6.5  Layout  Improvement 

Several  tactics  are  implemented  to  improve  the  cell  layout  after  the  rout- 
ing process.  Those  improvement  tactics  emphasize  either  reducing  the  routing 
tracks  or  modifying  the  cell  configuration  but  not  both. 

6.5.1  Reducing  the  Routing  Tracks 

There  are  two  tactics  which  can  be  used  in  reducing  the  routing  tracks. 

(1)  Fully  utilize  the  area  of  device  rows.  In  order  to  simplify  the  algorithm. 

device  rows  are  not  considered  as  part  of  the  routing  channel  during 
the  two-row  routing  process.  However,  available  area  on  device 
rows  can  be  fully  utilized  at  the  final  layout  improvement  stage. 
The  routing  pattern  shown  as  the  left  part  in  Figure  6.17  can  be 
transformed  into  the  corresponding  patterns  shown  in  the  right  side 
of  the  same  figure.  The  transformation  will  reduce  the  number 
of  horizontal  routing  tracks.  Consequently,  the  electrical  properties  of 
the  cell  will  be  improved. 

(2)  Net  transferring.  As  shown  in  Figure  6.18,  the  single  row  nets  can  be 

freely  transferred  from  bottom  channel  to  top  channel  or  vice 
versa.  This  process  will  also  help  in  reducing  the  number  of  hor- 
izontal routing  tracks. 

6.5.2  Modification  of  Cell  Configuration. 

As  described  in  Chapter  IV,  the  cell  configuration  is  determined  mainly  by 
the  number  of  device  columns.  However,  for  a specific  device  column,  the  cell 
configuration  can  also  be  slightly  modified  by  trunk  partition  and  routing  detour. 
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Figure  6.17  Layout  Improvement  by  Fully  Utilizing  Device  Areas. 


9 — 9 9 


6 — 6 6 


Figure  6.18  Net  Transferring  Technique  for  Layout  Improvement. 


(a)  (b) 


Figure  6.19  Modify  Cell  Configuration  by  Trunk  Partition,  a)  One 
Trunk  Partition;  b)  Three  Trunk  Partitions. 
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As  shown  in  Figure  6.19(a),  the  dogleging  routing  schemes  uses  three  horizon- 
tal routing  tracks.  However,  as  shown  in  Figure  6.19(b),  the  number  of  hor- 
izontal tracks  can  be  reduced  to  two  if  extra  trunk  partitions  are  used.  In  gen- 
eral. in  order  to  reduce  one  horizontal  routing  track,  two  more  vertical  rout- 
ing tracks  will  have  to  be  included.  If  the  cell  aspect  ratio  of  a cell  is  ori- 
ginally 


5 = H/W 


6(nr  + nt) 
8 ne  + 6n„ 


(6-5) 


It  will  be 


& =ff/W  = 


6 (nr  + nt  - 1) 
8 nc  + 6 (nv  + 2) 


(6-6) 


after  the  cell  modification  process.  Figure  6.20  shows  a more  sophisticated  dogleg- 
ing routing  example  of  modifying  the  cell  configuration. 
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(b) 

Figure  6.20  Modification  of  Cell  Configuration,  a)  Original  Routing 
Scheme;  b)  Use  a Lot  of  Trunk  Partitions  to  Reduce  the 
Number  of  Horizontal  Routing  Tracks. 


CHAPTER  VII 

SUMMARY  AND  CONCLUSIONS 

7,1  Experimental  Results 

The  CMOS  symbolic  cell  layout  generation  system  has  been  implemented  in 
C language  on  a Gould  minicomputer  under  UNIX  environment.  The  system, 
which  still  needs  improvement,  has  shown  two  great  advantages  over  the  tradi- 
tional manual  design  approach.  The  first  one  is  that  the  design  time  for  a typical 
50  transistor  cell  has  been  reduced  from  several  days  to  less  than  a minute.  The 
short  design  turn-around-time  has  allowed  the  layout  system  to  try  a lot  of  cell 
configurations  and  select  the  one  which  has  the  best  match  with  the  chip  floor 
plan.  The  second  advantage  for  the  system  is  that  it  emphasizes  optimizing  the 
total  chip  area  instead  of  the  individual  cell  area.  Although  the  area  for  each  gen- 
erated cell  may  be  larger  than  the  corresponding  manually  designed  cell,  the  total 
chip  area  is  optimized  through  the  reduction  of  the  cell  shape  mismatch.  Figure 
7.1  and  7.2  demonstrates  two  computer  generated  cells  based  upon  the  cell  boun- 
dary conditions  shown  in  Figure  4.14(a)  and  Figure  4.14(c).  As  illustrated  in 
Table  7.1,  the  area  penalty  for  these  two  cells  are  20%  and  15%  respectivev  com- 
pared to  the  manually  designed  cells.  However,  the  shape  mismatches  between  the 
computer  generated  cells  and  the  apportioned  chip  areas  are  generally  lower  than 
the  manually  designed  cells.  For  the  long-shape  cell  in  Figure  7.1,  the  shape 
mismatch  is  9%  of  the  total  cell  area.  For  the  wide-shape  cell  shown  in  Figure  7.2, 
the  shape  mismatch  is  16%  which  is  higher  than  the  accepted  level.  The  shape 
mismatch,  however,  can  be  reduced  by  the  slight  modification  of  the  cell 
configuration.  Figure  7.3  shows  a different  cell  layout  which  is  generated  by  modi- 
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Figure  7.1  Computer  Generated  Ful ladder  Cell  with  Long  Shape. 
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Figure  7.2  Computer  Generated  Ful ladder  Cell  with  Wide  Shape. 


Table  7.1  Geometric  Information  of  the  Computer  Generated  Cells 


Layout 

Width 

Height 

Cell 

Area 

Area 

Penalty 

Mismatch  (%) 
Area 

Figure  7.1 

52 

88 

4576 

20% 

416  (9%) 

Figure  7.2 

154 

46 

7084 

15% 

1104  (16%) 

Figure  7.3 

124 

54 

6696 

9% 

496  (7%) 

127 


Figure  7.3  Computer  Generated  Cell  Layout  with  a Configuration 
Slightly  Different  from  the  Cell  in  Figure  7.2. 
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fying  the  cell  in  Figure  7.2.  After  the  layout  improvement  process,  the  shape 
mismatch  is  reduced  from  16%  to  7%.  In  general,  the  average  area  penalty  for  the 
computer  generated  cells  is  about  10%  to  20%  compared  to  manual  design 
approach.  Nevertheless,  the  area  penalty  for  the  total  chip  is  below  10%.  The 
reduction  of  total  chip  area  is  due  to  the  flexibility  of  the  automated  cell  layout 
generation  system  and  the  iterative  improvements  between  the  cell  design  and 
chip  floor  planning  processes. 


7.2  Summary 

In  this  dissertation,  we  have  presented  a flexible  CMOS  cell  design  method 
to  enhance  the  function  of  general  cell  layout  system.  The  boundary  require- 
ments of  the  cell  generation  system  were  specified,  and  the  necessary 
techniques  for  each  task  of  the  system  were  developed.  In  Chapter  I,  we 
addressed  the  need  for  a flexible  cell  generation  system.  With  the  growing  use  of 
general  cell  layout  systems,  we  believe  that  research  in  this  direction  should  be 
encouraged. 

In  Chapter  II,  we  surveyed  the  research  progress  in  the  past  ten  years  and 
found  that  flexible  cell  configuration  is  still  a problem  yet  to  be  resolved. 

In  Chapter  III,  we  investigated  the  complete  automatic  layout  system  for 
IC  design  and  characterized  the  functional  role  of  cell  generation  in  the  whole  sys- 
tem. An  integrated  flexible  cell  generation  system  was  then  introduced  which 
intend  to  perceive,  characterize,  and  interpret  the  various  physical  characteris- 
tics of  the  practical  cell  layout  problem. 

In  Chapter  IV,  we  analyzed  the  boundary  constraints  for  cell  generation 
process.  A grid  structure,  with  a desired  aspect  ratio,  was  then  constructed 
based  upon  the  analysis. 
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In  Chapter  V,  the  focus  was  on  device  placement  techniques  which 
determined  optimal  device  terminal  locations  on  grid  structure.  A clustering  tech- 
nique was  introduced  for  the  purpose  of  an  optimized  ordering  among  the 
device  columns.  This  technique  determined  the  horizontal  grid  locations  of 
the  devices.  Determination  of  the  device  orientations  was  based  on  the 
circuit  heuristics  and  practical  wiring  considerations.  A vertical  constraint 
graph  based  technique  was  used  to  determine  the  vertical  grid  locations  of 
the  devices  with  an  emphasis  to  reduce  wiring  detours. 

In  Chapter  VI,  the  focus  was  on  the  development  of  routing  techniques 
to  realize  the  circuit  connectivity.  A terminal  labeling  technique  was  used  to 
transform  the  circuit  connectivity  information  from  the  circuit  domain  to  the 
topological  domain.  Each  multi-terminal  net  was  then  represented  as  a 
multi-point  net  on  the  grid  structure.  A revised  minimum  spanning  tree  algo- 
rithm was  then  implemented  to  decompose  the  multi-point  net  into  a set  of  two- 
point  wire  segments.  A combined  techniques  of  channel  routing  and  maze  rout- 
ing realized  each  wire  segment  with  physical  routing  path. 

7.3  Significance  of  This  Research 

The  first  accomplishment  of  this  dissertation  has  been  the  extensive  study 
of  the  real-time  interaction  between  the  chip  floor  planner  and  the  cell  generator. 
Although  the  study  still  needs  further  work,  the  experimental  results  have 
shown  that  the  integration  of  the  chip  floor  planner  and  the  cell  generator  is  a 
key  factor  in  the  success  of  an  automatic  IC  layout  system. 

Our  second  accomplishment  is  to  develop  a cell  structure  model  which  links 
the  symbolic  layout  domain  and  physical  layout  domain.  Symbolic  layout 
approach  has  a lot  of  advantages  which  include  fast  computation  and  tech- 
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nology  independence.  However,  as  most  of  the  boundary  conditions  and  cell 
specifications  are  given  in  terms  of  physical  parameters,  the  cell  structure 
model  will  help  to  integrate  practical  considerations  into  symbolic  layout  algo- 
rithms. 

A third  accomplishment  is  that  the  complex  cell  layout  problem  has  been 
effectively  divided  into  simpler  subproblems.  The  simplicity  of  each  subprob- 
lem results  in  the  ability  to  characterize  the  various  features  of  each  indivi- 
dual subproblem.  The  "divide  and  conquer"  method  shows  a practical  way  to 
solve  the  real  world  problem.  Some  of  the  innovative  techniques  developed  for 
solving  the  subproblems  should  also  be  a useful  reference  for  future  research  in 
this  direction. 


7.4  Recommendations  for  Future  Research 

In  this  dissertation,  we  have  presented  some  results  in  flexible  cell  genera- 
tion systems.  Although  there  have  been  some  encouraging  results,  the  system  is 
by  no  means  complete.  More  research  work  is  needed  to  enhance  or  improve  the 
functions  of  the  system.  In  the  specific  problem  areas  that  we  have  addressed 
here,  we  recommend  that  further  research  be  carried  on  the  following  areas: 

1.  Although  the  Gate  Matrix  Layout  structure  has  proved  to  be  a com- 

pact structure  for  cell  layout,  it  still  suffers  some  limitation  in 
cell  configuration.  The  minimum  number  of  device  columns  is  equal 
to  the  number  of  gate  signals.  It  inhibits  the  generation  of  thin- 
shape  cells.  A new  layout  structure  with  more  flexibility  in  cell 
configurations  is  needed  to  resolve  the  problem. 

2.  All  the  current  chip  floor  planning  algorithms  assume  that  the  cell 

area  remains  the  same  no  matter  how  its  configuration  changes.  How- 
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ever,  the  assumption  has  been  proved  incorrect  based  upon  the  experi- 
mental results  of  the  cell  generation  system.  Cell  areas  might  differ 
greatly  with  different  cell  configurations.  In  searching  for  an 
optimized  chip  area,  the  chip  floor  planner  need  to  consider  the 
size  variation  problem  together  with  the  configuration  problem. 
Research  in  this  direction  might  be  an  important  area  in  the 
future. 

3.  More  electrical  properties  should  be  considered  in  cell  layout  besides 
the  wiring  length,  the  transistor  abutment,  and  the  cell 

configuration.  It  may,  however,  be  impossible  to  optimize  all  the  lay- 
out parameters  at  the  same  time.  Further  research  in  layout  per- 
formance evaluation  should  be  valuable. 

In  retrospect,  it  is  clear  that  the  design  of  silicon  compiler  to  replace 
designers’  work  has  faced  many  complex  issues  that  are  yet  to  be  resolved.  How- 
ever, the  accumulation  of  all  the  past  accomplishments  has  put  the  "real"  silicon 
compiler  within  reach. 

After  several  years  of  research  in  this  field,  we  came  to  believe  that  a solu- 
tion to  this  problem  exists.  We  hope  that  more  researchers  will  take  up  this 
challenging  problem  in  an  effort  to  design  a better  chip  in  a much  shorter  turn- 
around time. 
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